COMPOSITIONS AND METHODS FOR HYDROXYLATING
EPOTHILONES

Field of the Invention 

The present invention relates to isolated nucleic acid sequences and
polypeptides encoded thereby for epothilone B hydroxylase and mutants and variants
thereof, and a ferredoxin located downstream from the epothilone B hydroxylase
gene. The present invention also relates to recombinant microorganisms expressing
epothilone B hydroxylase or a mutant or variant thereof and/or ferredoxin which are
capable of hydroxylating small organic molecule compounds, such as epothilones,
having a terminal alkyl group to produce compounds having a terminal hydroxyalkyl
group. Also provided are methods for recombinantly producing such microorganisms
as well as methods for using these recombinant microorganisms in the synthesis of
compounds having a terminal hydroxyalkyl group. The compositions and methods
of the present invention are useful in preparation of epothilones having a variety of
utilities in the pharmaceutical field. A novel epothilone analog produced using a
mutant of epothilone B hydroxylase of the present invention is also described.

Background of the Invention 

Epothilones are macrolide compounds that find utility in the pharmaceutical
field. For example, epothilones A and B having the structures:

[Chemical structure: epothilone bearing substituent R]

Epothilone A: R = H
Epothilone B: R = Me

have been found to exert microtubule-stabilizing effects similar to paclitaxel
(TAXOL®) and hence cytotoxic activity against rapidly proliferating cells, such as
tumor cells or cells associated with other hyperproliferative cellular diseases; see
Bollag et al., Cancer Res., Vol. 55, No. 11, 2325-2333 (1995).

Epothilones A and B are natural anticancer agents produced by Sorangium
cellulosum that were first isolated and characterized by Hofle et al., DE 4138042; WO
93/10121; Angew. Chem. Int. Ed. Engl., Vol. 35, No. 13/14, 1567-1569 (1996); and J.
Antibiot., Vol. 49, No. 6, 560-563 (1996). Subsequently, the total syntheses of
epothilones A and B have been published by Balog et al., Angew. Chem. Int. Ed.
Engl., Vol. 35, No. 23/24, 2801-2803 (1996); Meng et al., J. Am. Chem. Soc., Vol.
119, No. 42, 10073-10092 (1997); Nicolaou et al., J. Am. Chem. Soc., Vol. 119, No.
34, 7974-7991 (1997); Schinzer et al., Angew. Chem. Int. Ed. Engl., Vol. 36, No. 5,
523-524 (1997); and Yang et al., Angew. Chem. Int. Ed. Engl., Vol. 36, No. 1/2,
166-168 (1997). WO 98/25929 disclosed methods for the chemical synthesis of
epothilone A, epothilone B, analogs of epothilone and libraries of epothilone analogs.
The structure and production from Sorangium cellulosum DSM 6773 of epothilones
C, D, E, and F was disclosed in WO 98/22461. Figure 1 provides a diagram of the
biotransformation as described in WO 00/39276 of epothilone B to epothilone F in
Actinomycetes species strain SC15847 (ATCC PTA-1043), subsequently identified as
Amycolatopsis orientalis.

Cytochrome P450 enzymes are found in prokaryotic and eukaryotic cells and
have in common a heme binding domain which can be distinguished by an
absorbance peak at 450 nm when complexed with carbon monoxide. Cytochrome
P450 enzymes perform a broad spectrum of oxidative reactions on primarily
hydrophobic substrates including aromatic and benzylic rings, and alkanes. In
prokaryotes they are found as detoxifying systems and as a first enzymatic step in
metabolizing substrates such as toluene, benzene and camphor. Cytochrome P450
genes have also been found in biosynthetic pathways of secondary metabolites such as
nikkomycin in Streptomyces tendae (Bruntner, C. et al., 1999, Mol. Gen. Genet. 262:
102-114), doxorubicin (Dickens, M.L., Strohl, W.R., 1996, J. Bacteriol. 178:
3389-3395) and in the epothilone biosynthetic cluster of Sorangium cellulosum
(Julien, B. et al., 2000, Gene, 249: 153-160). With a few exceptions, the cytochrome
P450 systems in prokaryotes are composed of three proteins: a ferredoxin NADH or
NADPH dependent reductase, an iron-sulfur ferredoxin and the cytochrome P450
enzyme (Lewis, D.F., Hlavica, P., 2000, Biochim. Biophys. Acta, 1460: 353-374).
Electrons are transferred from ferredoxin reductase to the ferredoxin and finally to the
cytochrome P450 enzyme for the splitting of molecular oxygen.

Summary of the Invention

An object of the present invention is to provide isolated nucleic acid sequences 
encoding epothilone B hydroxylase and variants or mutants thereof and isolated 
nucleic acid sequences encoding ferredoxin or variants or mutants thereof. 

Another object of the present invention is to provide isolated polypeptides
comprising amino acid sequences of epothilone B hydroxylase and variants or
mutants thereof and isolated polypeptides comprising amino acid sequences of
ferredoxin and variants or mutants thereof.

Another object of the present invention is to provide structure coordinates of
the homology model of the epothilone B hydroxylase. The structure coordinates are
listed in Appendix 1. This model of the present invention provides a means for
designing modulators of a biological function of epothilone B hydroxylase as well as
additional mutants of epothilone B hydroxylase with altered specificities.

Another object of the present invention is to provide vectors comprising
nucleic acid sequences encoding epothilone B hydroxylase or a variant or mutant
thereof and/or ferredoxin or a variant or mutant thereof. In a preferred embodiment,
these vectors further comprise a nucleic acid sequence encoding ferredoxin.

Another object of the present invention is to provide host cells comprising a
vector containing a nucleic acid sequence encoding epothilone B hydroxylase or a
variant or mutant thereof and/or ferredoxin or a variant or mutant thereof.

Another object of the present invention is to provide a method for producing
recombinant microorganisms that are capable of hydroxylating compounds, and in
particular epothilones, having a terminal alkyl group to produce compounds having a
terminal hydroxyalkyl group.

Another object of the present invention is to provide microorganisms produced
recombinantly which are capable of hydroxylating compounds, and in particular
epothilones, having a terminal alkyl group to produce compounds having a terminal
hydroxyalkyl group.

Another object of the present invention is to provide methods for
hydroxylating compounds in these recombinant microorganisms. In particular, the
present invention provides a method for the preparation of hydroxyalkyl-bearing
epothilones, which compounds find utility as antitumor agents and as starting
materials in the preparation of other epothilone analogs.

Yet another object of the present invention is to provide a compound of
Formula A:

[Chemical structure: Formula A]

referred to herein as 24-OH epothilone B or 24-OH EpoB, as well as compositions
and methods for production of compositions comprising the compound of Formula A.

Brief Description of the Figures 

Figure 1 provides a schematic of the biotransformation as set forth in WO
00/39276, U.S. Application Serial No. 09/468,854, filed December 21, 1999, of
epothilone B to epothilone F by Amycolatopsis orientalis strain SC15847 (PTA-1043).

Figure 2 shows the nucleic acid sequence alignments of SEQ ID NO:5 through
SEQ ID NO:22 used to design the PCR primers for cloning of the nucleic acid
sequence encoding epothilone B hydroxylase.

Figure 3 shows the sequence alignment between epothilone B hydroxylase
(SEQ ID NO:2) and EryF (PDB code 1JIN chain A; SEQ ID NO:76). The asterisks
indicate sequence identities, the colons (:) similar residues.

Figure 4 provides a homology model of epothilone B hydroxylase based upon
sequence alignment with EryF as shown in Figure 3.

Figure 5 shows an energy plot of the epothilone B hydroxylase model
(indicated by dashed line) relative to EryF (PDB code 1JIN; indicated by solid line).
An averaging window size of 51 residues was used, i.e., the energy at a given residue
position is calculated as the average of the energies of the 51 residues in the sequence
that lie within a window centered on the given residue.
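
The windowed averaging described for Figure 5 can be sketched as follows. This is a minimal illustration only, assuming the per-residue energies are available as a plain list of numbers. The function name `windowed_average` and the decision to truncate the window near the ends of the chain are assumptions; the source does not state how the chain termini were handled or which software produced the plot.

```python
def windowed_average(energies, window=51):
    """Average each value with its neighbors in a centered window.

    Near the ends of the sequence the window is truncated to the
    residues that exist (an assumption; the source does not say how
    chain termini were treated).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(energies)):
        lo = max(0, i - half)
        hi = min(len(energies), i + half + 1)
        segment = energies[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


# Toy data: five residues smoothed with a three-residue window.
profile = windowed_average([1.0, 2.0, 3.0, 4.0, 5.0], window=3)
```

With the default `window=51`, each interior residue is averaged with the 25 residues on either side, which matches the description of the given residue sitting at the central position.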

Detailed Description of the Invention

The present invention relates to isolated nucleic acid sequences and
polypeptides and methods for obtaining compounds with desired substituents at a
terminal carbon position. In particular, the present invention provides compositions
and methods for the preparation of hydroxyalkyl-bearing epothilones, which
compounds find utility as antitumor agents and as starting materials in the preparation
of other epothilone analogs.

The term "epothilone," as used herein, denotes compounds containing an
epothilone core and a side chain group as defined herein. The term "epothilone core,"
as used herein, denotes a moiety containing the core structure (with the numbering of
ring system positions used herein shown):

[Chemical structure: epothilone core with ring positions numbered]

wherein the substituents are as follows:



Q is selected from the group consisting of

[Chemical structures]

W is O or NR6;

X is selected from the group consisting of O, H and OR7;

M is O, S, NR8, CR9R10;

B1 and B2 are selected from the group consisting of OR11, OCOR11;

R1-R5 and R12-R17 are selected from the group consisting of H, alkyl,
substituted alkyl, aryl, and heterocyclo, and wherein R1 and R2 are alkyl they can be
joined to form a cycloalkyl;

R6 is selected from the group consisting of H, alkyl, and substituted alkyl;

R7 and R11 are selected from the group consisting of H, alkyl, substituted
alkyl, trialkylsilyl, alkyldiarylsilyl and dialkylarylsilyl;

R8 is selected from the group consisting of H, alkyl, substituted alkyl, R13C=O,
R14OC=O and R15SO2; and

R9 and R10 are selected from the group consisting of H, halogen, alkyl,
substituted alkyl, aryl, heterocyclo, hydroxy, R16C=O, and R17OC=O.

The term "side chain group" refers to substituent G as defined above for
Epothilone A or B or G1 and G2 as shown below.

G1 is the following formula V

HO-CH2-(A1)n-(Q)m-(A2)o (V),

and

G2 is the following formula VI

CH3-(A1)n-(Q)m-(A2)o (VI),

where

A1 and A2 are independently selected from the group of optionally substituted
C1-C3 alkyl and alkenyl;

Q is an optionally substituted ring system containing one to three rings and at
least one carbon to carbon double bond in at least one ring; and

n, m, and o are integers independently selected from the group consisting of
zero and 1, where at least one of m, n or o is 1.

The term "terminal carbon" or "terminal alkyl group" refers to the terminal
carbon or terminal methyl group of the moiety either directly bonded to the epothilone
core at position 15 or to the terminal carbon or terminal alkyl group of the side chain
group bonded at position 15. It is understood that the term "alkyl group" includes
alkyl and substituted alkyl as defined herein.

The term "alkyl" refers to optionally substituted, straight or branched chain
saturated hydrocarbon groups of 1 to 20 carbon atoms, preferably 1 to 7 carbon atoms.
The expression "lower alkyl" refers to optionally substituted alkyl groups of 1 to 4
carbon atoms.

The term "substituted alkyl" refers to an alkyl group substituted by, for
example, one to four substituents, such as halo, trifluoromethyl, trifluoromethoxy,
hydroxy, alkoxy, cycloalkyloxy, heterocyclooxy, oxo, alkanoyl, aryloxy, alkanoyloxy,
amino, alkylamino, arylamino, aralkylamino, cycloalkylamino, heterocycloamino,
disubstituted amines in which the 2 amino substituents are selected from alkyl, aryl or
aralkyl, alkanoylamino, aroylamino, aralkanoylamino, substituted alkanoylamino,
substituted arylamino, substituted aralkanoylamino, thiol, alkylthio, arylthio,
aralkylthio, cycloalkylthio, heterocyclothio, alkylthiono, arylthiono, aralkylthiono,
alkylsulfonyl, arylsulfonyl, aralkylsulfonyl, sulfonamido (e.g. SO2NH2), substituted
sulfonamido, nitro, cyano, carboxy, carbamyl (e.g. CONH2), substituted carbamyl
(e.g. CONH alkyl, CONH aryl, CONH aralkyl or cases where there are two
substituents on the nitrogen selected from alkyl, aryl or aralkyl), alkoxycarbonyl, aryl,
substituted aryl, guanidino and heterocyclos, such as indolyl, imidazolyl, furyl,
thienyl, thiazolyl, pyrrolidyl, pyridyl, pyrimidyl and the like. Where it is noted above
that the substituent is further substituted, it will be with halogen, alkyl, alkoxy, aryl
or aralkyl.

In accordance with one aspect of the present invention there are provided
isolated polynucleotides that encode epothilone B hydroxylase, an enzyme capable of
hydroxylating epothilones having a terminal alkyl group to produce epothilones
having a terminal hydroxyalkyl group.

In accordance with another aspect of the present invention there are provided
isolated polynucleotides that encode a ferredoxin, the gene for which is located
downstream from the epothilone B hydroxylase gene. Ferredoxin is a protein of the
cytochrome P450 system.

By "polynucleotides", as used herein, it is meant to include any form of DNA
or RNA such as cDNA or genomic DNA or mRNA, respectively, encoding these
enzymes or an active fragment thereof which are obtained by cloning or produced
synthetically by well known chemical techniques. DNA may be double- or
single-stranded. Single-stranded DNA may comprise the coding or sense strand or the
non-coding or antisense strand. Thus, the term polynucleotide also includes
polynucleotides exhibiting at least 60%, preferably at least 80%, homology to
sequences disclosed herein, and which hybridize under stringent conditions to the
above-described polynucleotides. As used herein, the term "stringent conditions"
means hybridization conditions of 60°C at 2X SSC buffer. More preferred are isolated
nucleic acid molecules capable of hybridizing to the nucleic acid sequence set forth in
SEQ ID NO:1, 30, 32, 34, 36, 37, 38, 39, 40, 41, 42, 60, 62, 64, 66, 68, 70, 72 or 74
or SEQ ID NO:3, or to the complementary sequence of the nucleic acid sequence set
forth in SEQ ID NO:1, 30, 32, 34, 36, 37, 38, 39, 40, 41, 42, 60, 62, 64, 66, 68, 70,
72 or 74 or SEQ ID NO:3, under hybridization conditions of 3X SSC at 65°C for 16
hours, and which are capable of remaining hybridized to the nucleic acid sequence set
forth in SEQ ID NO:1, 30, 32, 34, 36, 37, 38, 39, 40, 41, 42, 60, 62, 64, 66, 68, 70,
72 or 74 or SEQ ID NO:3, or to the complementary sequence of the nucleic acid
sequence set forth in SEQ ID NO:1, 30, 32, 34, 36, 37, 38, 39, 40, 41, 42, 60, 62, 64,
66, 68, 70, 72 or 74 or SEQ ID NO:3, under wash conditions of 0.5X SSC, 55°C for
30 minutes.

WO 2004/061116
PCT/US2003/034082

In one embodiment, a polynucleotide of the present invention comprises the
genomic DNA depicted in SEQ ID NO:1 or a homologous sequence or fragment
thereof which encodes a polypeptide having similar activity to that of this epothilone
B hydroxylase. Alternatively, a polynucleotide of the present invention may comprise
the genomic DNA depicted in SEQ ID NO:3 or a homologous sequence or fragment
thereof which encodes a polypeptide having similar activity to this ferredoxin. Due to
the degeneracy of the genetic code, polynucleotides of the present invention may also
comprise other nucleic acid sequences encoding this enzyme and derivatives, variants
or active fragments thereof.
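
The point about degeneracy of the genetic code can be illustrated with a short sketch: two different nucleic acid sequences may encode the same polypeptide. The example below builds the standard genetic code table and translates two short toy sequences. The sequences `seq_a` and `seq_b` and the helper names are hypothetical and chosen only for illustration; they are not taken from the sequence listing.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding-strand DNA string, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def synonymous_codons(amino_acid):
    """Return every codon that encodes the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


# Hypothetical toy sequences that differ at the nucleotide level.
seq_a = "ATGGAACGT"  # Met-Glu-Arg
seq_b = "ATGGAGAGA"  # different codons, same three residues
same_protein = translate(seq_a) == translate(seq_b)
```

Here `same_protein` is true even though the two sequences differ at three nucleotide positions, because glutamic acid and arginine are each encoded by more than one codon.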

The present invention also relates to variants of these polynucleotides which
may be naturally occurring, i.e., present in microorganisms such as Amycolatopsis
orientalis and Amycolata autotrophica, or in soil or other sources from which nucleic
acids can be isolated, or mutants prepared by well known mutagenesis techniques.
Exemplary variant polynucleotides of the present invention are depicted in SEQ ID
NO:36-42.

By "mutants" as used herein it is meant to be inclusive of nucleic acid
sequences with one or more point mutations, or deletions or additions of nucleic acids
as compared to SEQ ID NO:1 or 3, but which still encode a polypeptide or fragment
with similar activity to the polypeptides encoded by SEQ ID NO:1 or 3. In a
preferred embodiment, mutations are made which alter the substrate specificity and/or
yield of the enzyme. A preferred region of mutation with respect to the epothilone B
hydroxylase gene is that region of the nucleic acid sequence coding for the
approximately 113 amino acid residues comprising the active site of the enzyme.
Also preferred are mutants encoding a polypeptide with at least one amino acid
substitution at amino acid position GLU31, ARG67, ARG88, ILE92, ALA93,
VAL106, ILE130, ALA140, MET176, PHE190, GLU231, SER294, PHE237, or
ILE365 of SEQ ID NO:1. Exemplary polynucleotide mutants of the present invention
are depicted in SEQ ID NO:30, 32, 34, 60, 62, 64, 66, 68, 70, 72 and 74.

Cloning of the nucleic acid sequence of SEQ ID NO:1 encoding epothilone B
hydroxylase was performed using PCR primers designed by aligning the nucleic acid
sequences of six cytochrome P450 genes from bacteria. The following cytochrome
P450 genes were aligned:

Sequence 1: Locus: STMSUACB; Accession number: M32238; Reference:
            Omer, C.A., J. Bacteriol. 172: 3335-3345 (1990)
Sequence 2: Locus: STMSUBCB; Accession number: M32239; Reference:
            Omer, C.A., J. Bacteriol. 172: 3335-3345 (1990)
Sequence 3: Locus: AB018074 (formerly STMORFA); Accession number:
            AB018074; Reference: Ueda, K., J. Antibiot. 48: 638-646 (1995)
Sequence 4: Locus: SSU65940; Accession number: U65940; Reference:
            Motamedi, H., J. Bacteriol. 178: 5243-5248 (1996)
Sequence 5: Locus: STMOLEP; Accession number: L37200; Reference:
            Rodriguez, A.M., FEMS Microbiol. Lett. 127: 117-120 (1995)
Sequence 6: Locus: SERCP450A; Accession number: M83110; Reference:
            Andersen, J.F. and Hutchinson, C.R., J. Bacteriol. 174: 725-735
            (1992)

Alignments were performed using an implementation of the algorithm of
Myers, E.W. and W. Miller, 1988, CABIOS 4:1, 11-17, the Align program from
Scientific and Educational Software (Durham, North Carolina, USA). Three highly
conserved regions were identified: in the I-helix, containing the oxygen binding
domain; in the K-helix; and spanning the B-bulge and L-helix containing the
conserved heme binding domain. Primers were designed to the three conserved
regions identified in the alignment. Primers P450-1+ (SEQ ID NO:23) and P450-1a+
(SEQ ID NO:24) were designed from the I-helix, Primer P450-2+ (SEQ ID NO:25)
was designed from the B-bulge and L-helix region and Primer P450-3- (SEQ ID
NO:27) was designed as the reverse complement to the heme binding protein.
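
Designing a reverse primer as the reverse complement of a conserved coding-strand region can be sketched as below. The input `conserved_sense` is a hypothetical 12-nucleotide stretch invented for illustration; it is not one of the primer sequences of the sequence listing, and real P450 primers designed from alignments would typically also contain degenerate positions, which this sketch does not handle.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical conserved stretch on the coding strand (illustration only).
conserved_sense = "TTCGGCCACGGC"
reverse_primer = reverse_complement(conserved_sense)
```

Applying `reverse_complement` twice returns the original sequence, which is a quick sanity check when preparing a forward and reverse primer pair.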

Genomic fragments were then amplified via polymerase chain reaction (PCR).
After PCR amplification, the reaction products were separated by gel electrophoresis
and fragments of the expected size were excised. The DNA was extracted from the
agarose gel slices using the Qiaquick gel extraction procedure (Qiagen, Santa Clarita,
California, USA). The fragments were then cloned into the PCRscript vector
(Stratagene, La Jolla, California, USA) using the PCRscript Amp cloning kit
(Stratagene). Colonies containing inserts were picked into 1-2 ml of LB broth with 100
µg/ml ampicillin and grown at 30-37°C for 16-24 hours at 230-300 rpm. Plasmid
isolation was performed using the Mo Bio miniplasmid prep kit (Mo Bio, Solana
Beach, California, USA). This plasmid DNA was used as a PCR and sequencing
template and for restriction digest analysis.

The cloned PCR products were sequenced using the Big-Dye sequencing kit
from Applied Biosystems (Foster City, California, USA) and were analyzed using the
ABI310 sequencer (Applied Biosystems, Foster City, California, USA). The sequence
of the inserts was used to perform a TblastX search, using the protocol of Altschul,
S.F., et al., J. Mol. Biol. 215:403-410 (1990), of the non-redundant protein database.
Unique sequences having a significant similarity to known cytochrome P450 proteins
were retained. Using this approach, a total of nine different P450 sequences were
identified from SC15847, seven from the genomic DNA template and two from the
cDNA. Two P450 sequences were found in common between the DNA and cDNA
templates. Of the fifty cDNA clones analyzed, two sequences were predominant,
with twenty clones each. These two genes were then cloned from the genomic DNA.

The nucleic acid sequence of the genomic DNA was determined using the
Big-Dye sequencing system (Applied Biosystems) and analyzed using an ABI310
sequencer. This sequence is depicted in SEQ ID NO:1. An open reading frame
coding for a protein of 404 amino acids and a predicted molecular weight of 44.7 kDa
was found within the cloned BglII fragment. The deduced amino acid sequence of
this polypeptide is depicted in SEQ ID NO:2. The amino acid sequence of this
polypeptide was found to share 51% identity with the NikF protein of Streptomyces
tendae (Bruntner, C. et al., 1999, Mol. Gen. Genet. 262: 102-114) and 48% identity
with the Sca-2 protein of S. carbophilus (Watanabe, I. et al., 1995, Gene 163: 81-85).
Both of these enzymes belong to the cytochrome P450 family 105. The invariable
cysteine found in the heme-binding domain of all cytochrome P450 enzymes is found
at residue 356. This gene for epothilone B hydroxylase has been named ebh. The
ATG start codon of a putative ferredoxin gene of 64 amino acids is found nine
basepairs downstream from the stop codon of ebh. This protein was found to share
50% identity with the ferredoxins of S. griseolus (O'Keefe, D.P., et al., 1991,
Biochemistry 30: 447-455) and S. noursei (Brautaset, T., et al., 2000, Chem. Biol. 7:
395-403). The nucleic acid sequence encoding this ferredoxin is depicted in SEQ ID
NO:3 and the amino acid sequence for this ferredoxin polypeptide is depicted in SEQ
ID NO:4.
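
As a rough plausibility check on the figures quoted above, the length of a coding region and an approximate protein mass can be estimated from the residue count. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from the source; the reported 44.7 kDa would have been computed from the actual amino acid composition, so only approximate agreement is expected.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb value; an assumption


def coding_length_nt(n_residues, include_stop=True):
    """Nucleotides needed to encode n_residues, optionally with a stop codon."""
    return 3 * n_residues + (3 if include_stop else 0)


def approx_mass_kda(n_residues, residue_mass=AVERAGE_RESIDUE_MASS_DA):
    """Crude protein mass estimate in kDa from the residue count alone."""
    return n_residues * residue_mass / 1000.0


ebh_orf_nt = coding_length_nt(404)   # 404 codons plus one stop codon
ebh_mass_kda = approx_mass_kda(404)  # compare with the reported 44.7 kDa
fdx_orf_nt = coding_length_nt(64)    # putative ferredoxin, 64 residues
```

Under these assumptions the ebh open reading frame spans 1215 nucleotides including the stop codon, and the crude mass estimate falls within about one percent of the reported value.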

The ebh gene sequence was also used to isolate variant cytochrome P450
genes from other microorganisms. Exemplary variant polynucleotides ebh43491,
ebh14930, ebh53630, ebh53550, ebh39444, ebh43333 and ebh35165 of the present
invention and the species from which they were isolated are depicted in Table 1
below. The nucleic acid sequences for these variants are depicted in SEQ ID
NO:36-42, respectively.

Table 1: Variant polynucleotides

ATCC ID    Species                     ebh gene designation
43491      Amycolatopsis orientalis    ebh43491
14930      Amycolatopsis orientalis    ebh14930
53630      Amycolatopsis orientalis    ebh53630
53550      Amycolatopsis orientalis    ebh53550
39444      Amycolatopsis orientalis    ebh39444
43333      Amycolatopsis orientalis    ebh43333
35165      Amycolatopsis orientalis    ebh35165

The amino acid sequences encoded by the exemplary variants ebh43491,
ebh14930, ebh53630, ebh53550, ebh39444, ebh43333 and ebh35165 are depicted in
SEQ ID NO:43-49, respectively. Table 2 provides a summary of the amino acid
substitutions of these exemplary variants.

Table 2: Amino acid substitutions

Position   ebh   Substitution   ebh variant
100        Gly   Ser            ebh14930, ebh43333, ebh53550, ebh43491
101        Lys   Arg            ebh14930
130        Ile   Leu            ebh14930
192        Ser   Gln            ebh14930
224        Ser   Thr            ebh14930, ebh43333, ebh53550, ebh43491
285        Ile   Val            ebh14930, ebh43333, ebh53550, ebh43491
69         Ser   Asn            ebh43333
256        Val   Ala            ebh43333, ebh53550, ebh43491
93         Ala   Ser            ebh53550
326        Asp   Glu            ebh53550, ebh43491
333        Thr   Ala            ebh53550, ebh43491
133        Leu   Met            ebh43491
398        His   Arg            ebh39444

Mutations were also introduced into the coding region of the ebh gene to
identify mutants with improved yield, and/or rate of bioconversion and/or altered
substrate specificity. Exemplary mutant nucleic acid sequences of the present
invention are depicted in SEQ ID NO:30, 32, 34, 60, 62, 64, 66, 68, 70, 72 and 74.

The nucleic acid sequence of SEQ ID NO:30 encodes a mutant ebh25-1 which
exhibits altered substrate specificity. Plasmid pANT849ebh25-1 containing this
mutant gene was deposited and accepted by an International Depository Authority
under the provisions of the Budapest Treaty. The deposit was made on November 21,
2002 to the American Type Culture Collection at 10801 University Boulevard in
Manassas, Virginia 20110-2209. The ATCC Accession Number is PTA-4809. All
restrictions upon public access to this plasmid will be irrevocably removed upon
granting of this patent application. The Deposit will be maintained in a public
depository for a period of thirty years after the date of deposit or five years after the
last request for a sample or for the enforceable life of the patent, whichever is longer.
The above-referenced plasmid was viable at the time of the deposit. The deposit will
be replaced if viable samples cannot be dispensed by the depository.

This S. lividans transformant identified in the screening of mutation 25
(primers NPB29-mut25f (SEQ ID NO:58) and NPB29-mut25r (SEQ ID NO:59)) was
found to produce a product with a different HPLC elution time than epothilone B or
epothilone F. A sample of this unknown was analyzed by LC-MS and was found to
have a molecular weight of 523 (M.W.), consistent with a single hydroxylation of
epothilone B. Plasmid DNA was isolated from the S. lividans culture and used as a
template for PCR amplification using primers NPB29-6f (SEQ ID NO:28) and
NPB29-7r (SEQ ID NO:29) (see Example 17). The expected fragment was obtained
and sequenced using the Big-Dye sequencing system. The ebh25-1 mutant was found
to have two mutations resulting in changes in the amino acid sequence of the protein:
asparagine 195 is changed to serine and serine 294 is changed to proline. The position
targeted for mutation at codon 238 was found to have a two nucleotide change, which
did not result in a change of the amino acid sequence of the protein. The amino acid
sequence of the mutant polypeptide encoded by SEQ ID NO:30 is depicted in SEQ ID
NO:31.
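
The statement that a molecular weight of 523 is consistent with a single hydroxylation of epothilone B can be checked with simple arithmetic. The sketch below assumes the molecular formula of epothilone B is C27H41NO6S (not stated in this passage) and assumes that the reported value corresponds to a monoisotopic mass truncated to an integer; the isotope masses are standard values rounded to seven decimals. Hydroxylation of a C-H bond to C-OH adds one oxygen atom and leaves the hydrogen count unchanged.

```python
# Monoisotopic masses of the most abundant isotopes (Da).
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.0078250,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
}


def molecular_mass(formula):
    """Sum isotope masses for a formula given as {element: count}."""
    return sum(MONOISOTOPIC_MASS[el] * n for el, n in formula.items())


# Assumed formula of epothilone B (not given in the passage).
epothilone_b = {"C": 27, "H": 41, "N": 1, "O": 6, "S": 1}
# Single hydroxylation: one additional oxygen atom.
hydroxylated = dict(epothilone_b, O=epothilone_b["O"] + 1)

mass_parent = molecular_mass(epothilone_b)
mass_product = molecular_mass(hydroxylated)
mass_shift = mass_product - mass_parent
```

Under these assumptions the parent mass is about 507.27 and the singly hydroxylated product about 523.26, a shift of roughly 16 mass units. Epothilone F and the unknown would therefore share the same mass, which is why the differing HPLC elution time, rather than the mass, distinguishes them.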

The nucleic acid sequence of SEQ ID NO:32 encodes a mutant ebh10-53,
which exhibits improved bioconversion yield. This S. lividans transformant identified
in the screening of mutation 10 (primers NPB29-mut10f (SEQ ID NO:54) and
NPB29-mut10r (SEQ ID NO:55)) produced a greater yield of epothilone F. Plasmid
DNA was isolated from the S. lividans culture and used as a template for PCR
amplification using primers NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID
NO:29) (see Example 16). The expected fragment was obtained and sequenced using
the Big-Dye sequencing system. The ebh10-53 mutant was found to have two
mutations resulting in changes in the amino acid sequence of the protein: glutamic
acid 231 is changed to arginine and phenylalanine 190 is changed to tyrosine.
Position 231 was the target of the mutagenesis; the change at residue 190 is an
inadvertent change that is an artifact of the mutagenesis procedure. The amino acid
sequence of the mutant polypeptide encoded by SEQ ID NO:32 is depicted in SEQ ID
NO:33.
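
Substitutions of the kind described for these mutants (for example, glutamic acid 231 to arginine) can be recorded and applied programmatically. The following is a minimal sketch, assuming a notation such as `Glu231Arg` with 1-based positions; the helper names and the four-residue peptide `toy_peptide` are hypothetical and stand in for the full sequences of the sequence listing, which are not reproduced here.

```python
import re

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def parse_substitution(text):
    """Parse 'Glu231Arg' into (original, 1-based position, replacement)."""
    match = re.fullmatch(r"([A-Za-z]{3})(\d+)([A-Za-z]{3})", text)
    if match is None:
        raise ValueError(f"unrecognized substitution: {text!r}")
    old, position, new = match.groups()
    return (THREE_TO_ONE[old.capitalize()], int(position),
            THREE_TO_ONE[new.capitalize()])


def apply_substitutions(sequence, substitutions):
    """Apply substitutions, checking the original residue at each position."""
    residues = list(sequence)
    for text in substitutions:
        old, position, new = parse_substitution(text)
        if residues[position - 1] != old:
            raise ValueError(f"{text}: found {residues[position - 1]} instead")
        residues[position - 1] = new
    return "".join(residues)


# Hypothetical four-residue peptide; positions here are not ebh positions.
toy_peptide = "MEFK"
toy_mutant = apply_substitutions(toy_peptide, ["Glu2Arg", "Phe3Tyr"])
```

Checking the original residue before replacing it guards against off-by-one numbering errors, which are easy to make when positions are copied from a sequence alignment.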

The nucleic acid sequence of SEQ ID NO:34 encodes a mutant ebh24-16,
which also exhibits improved bioconversion yield. This S. lividans transformant,
ebh24-16, identified in the screening of mutation 24 (primers NPB29-mut24f (SEQ ID
NO:56) and NPB29-mut24r (SEQ ID NO:57)), also produced a greater yield of
epothilone F. Plasmid DNA was isolated from the S. lividans culture and used as a
template for PCR amplification using primers NPB29-6f (SEQ ID NO:28) and
NPB29-7r (SEQ ID NO:29). The expected fragment was obtained and sequenced
using the Big-Dye sequencing system. The ebh24-16 mutant was found to have two
mutations resulting in changes in the amino acid sequence of the protein:
phenylalanine 237 is changed to alanine and isoleucine 92 is changed to valine.
Position 237 was the target of the mutagenesis; the change at residue 92 is an
inadvertent change that is an artifact of the mutagenesis procedure. The amino acid
sequence of the mutant polypeptide encoded by SEQ ID NO:34 is depicted in SEQ ID
NO:35.

The nucleic acid sequence of SEQ ID NO:60 encodes a mutant ebh24-16d8,
which also exhibits improved bioconversion yield. This S. rimosus transformant,
ebh24-16d8, identified in the screening of mutation 59 (primer NPB29mut59 (SEQ ID
NO:70)), also produced a greater yield of epothilone F. Plasmid DNA was isolated
from the S. rimosus culture and used as a template for PCR amplification using
primers NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29). The expected
fragment was obtained and sequenced using the Big-Dye sequencing system. The
ebh24-16d8 mutant was found to have one mutation resulting in a change in the
amino acid sequence of the protein: arginine 67 is changed to glutamine. This change
is an artifact of the mutagenesis procedure. The amino acid sequence of the mutant
polypeptide encoded by SEQ ID NO:60 is SEQ ID NO:61.

The nucleic acid sequence of SEQ ID NO:62 encodes a mutant ebh24-16c11,
which also exhibits improved bioconversion yield. This S. rimosus transformant,
ebh24-16c11, identified in the screening of mutation 59 (primer NPB29mut59 (SEQ
ID NO:70)), also produced a greater yield of epothilone F. Plasmid DNA was isolated
from the S. rimosus culture and used as a template for PCR amplification using
primers NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29). The expected
fragment was obtained and sequenced using the Big-Dye sequencing system. The
ebh24-16c11 mutant was found to have two additional mutations resulting in changes
in the amino acid sequence of the protein: alanine 93 is changed to glycine and
isoleucine 365 is changed to threonine. Position 93 is the target of the
mutagenesis; the change at 365 is an artifact of the mutagenesis procedure. The
amino acid sequence of the mutant polypeptide encoded by SEQ ID NO:62 is
depicted in SEQ ID NO:63.

The nucleic acid sequence of SEQ ID NO:64 encodes a mutant ebh24-16-16,
which also exhibits improved bioconversion yield. This S. rimosus transformant,
ebh24-16-16, identified in the screening of random mutants of ebh24-16, also
produced a greater yield of epothilone F. Plasmid DNA was isolated from the S.
rimosus culture and used as a template for PCR amplification using primers
NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29). The expected fragment was
obtained and sequenced using the Big-Dye sequencing system. The ebh24-16-16
mutant was found to have one additional mutation resulting in a change in the amino
acid sequence of the protein: valine 106 is changed to alanine. The amino acid
sequence of the mutant polypeptide encoded by SEQ ID NO:64 is depicted in SEQ ID
NO:65.

The nucleic acid sequence of SEQ ID NO:66 encodes a mutant ebh24-16-74,
which also exhibits improved bioconversion yield. This S. rimosus transformant,
ebh24-16-74, identified in the screening of random mutants of ebh24-16, also
produced a greater yield of epothilone F. Plasmid DNA was isolated from the S.
rimosus culture and used as a template for PCR amplification using primers
NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29). The expected fragment was
obtained and sequenced using the Big-Dye sequencing system. The ebh24-16-74
mutant was found to have one additional mutation resulting in a change in the amino
acid sequence of the protein: arginine 88 is changed to histidine. The amino acid
sequence of the mutant polypeptide encoded by SEQ ID NO:66 is SEQ ID NO:67.

The nucleic acid sequence of SEQ ID NO:68 encodes a mutant ebh-M18,
which also exhibits improved bioconversion yield. This S. rimosus transformant,
identified in the screening of random mutants of ebh, also produced a
greater yield of epothilone F. Plasmid DNA was isolated from the S. rimosus culture
and used as a template for PCR amplification using primers NPB29-6f (SEQ ID
NO:28) and NPB29-7r (SEQ ID NO:29). The expected fragment was obtained and
sequenced using the Big-Dye sequencing system. The ebh-M18 mutant was found to
have two mutations resulting in changes in the amino acid sequence of the protein:
glutamic acid 31 is changed to lysine and methionine 176 is changed to valine. The
amino acid sequence of the mutant polypeptide encoded by SEQ ID NO:68 is
depicted in SEQ ID NO:69.

The nucleic acid sequence of SEQ ID NO:72 encodes a mutant ebh24-16g8,
which also exhibits improved bioconversion yield. This S. rimosus transformant,
ebh24-16g8, identified in the screening of mutation 50 (primer NPB29mut50 (SEQ ID
NO:71)), also produced a greater yield of epothilone F. Plasmid DNA was isolated
from the S. rimosus culture and used as a template for PCR amplification using
primers NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29). The expected
fragment was obtained and sequenced using the Big-Dye sequencing system. The
ebh24-16g8 mutant was found to have two additional mutations resulting in changes
in the amino acid sequence of the protein: methionine 176 is changed to alanine and
isoleucine 130 is changed to threonine. Position 176 is the target of the
mutagenesis; the change at 130 is an artifact of the mutagenesis procedure. The
amino acid sequence of the mutant polypeptide encoded by SEQ ID NO:72 is
depicted in SEQ ID NO:73.

The nucleic acid sequence of SEQ ID NO:74 encodes a mutant ebh24-16b9,
which also exhibits improved bioconversion yield. This S. rimosus transformant,
ebh24-16b9, identified in the screening of mutation 50 (primer NPB29mut50 (SEQ ID
NO:71)), also produced a greater yield of epothilone F. Plasmid DNA was isolated
from the S. rimosus culture and used as a template for PCR amplification using
primers NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29). The expected
fragment was obtained and sequenced using the Big-Dye sequencing system. The
ebh24-16b9 mutant was found to have two additional mutations resulting in changes
in the amino acid sequence of the protein: methionine 176 is changed to serine and
alanine 140 is changed to threonine. Position 176 is the target of the mutagenesis;
the change at 140 is an artifact of the mutagenesis procedure. The amino acid
sequence of the mutant polypeptide encoded by SEQ ID NO:74 is depicted in SEQ ID
NO:75.
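
The amino acid changes reported for the mutants above can be written in the
common shorthand "A93G" (alanine 93 to glycine). The following is a minimal
illustrative sketch, not part of the disclosed method, of applying such
substitutions to a protein sequence held as a one-letter string. The function
name and the short toy sequence are hypothetical; the actual sequences are
those of the Sequence Listing.

```python
import re


def apply_substitutions(sequence, substitutions):
    """Return a copy of `sequence` with point substitutions applied.

    Each substitution is written as <old residue><1-based position><new residue>,
    for example "A93G". The old residue is checked against the sequence so that
    a numbering mistake is reported rather than silently applied.
    """
    residues = list(sequence)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != old:
            raise ValueError(
                f"expected {old} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


# Toy example only (hypothetical four-residue sequence, not SEQ ID NO:2).
toy_mutant = apply_substitutions("MAVR", ["A2G", "R4H"])
```

With the real numbering, the ebh24-16c11 changes described above would be
passed as ["A93G", "I365T"].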

A mixture composed of the plasmids pANT849ebh-24-16, pANT849ebh-10-53,
pANT849ebh-24-16d8, pANT849ebh-24-16c11, pANT849ebh-24-16-16,
pANT849ebh-24-16-74, pANT849ebh-24-16b9, pANT849ebh-M18 and
pANT849ebh-24-16g8 for these nine mutant genes was deposited and accepted by an
International Depository Authority under the provisions of the Budapest Treaty.
The deposit was made on November 21, 2002 to the American Type Culture
Collection at 10801 University Boulevard in Manassas, Virginia 20110-2209. The
ATCC Accession Number is PTA-4808. All restrictions upon public access to this
mixture of plasmids will be irrevocably removed upon granting of this patent
application. The deposit will be maintained in a public depository for a period of
thirty years after the date of deposit or five years after the last request for a
sample or for the enforceable life of the patent, whichever is longer. The
above-referenced mixture of plasmids was viable at the time of the deposit. The
deposit will be replaced if viable samples cannot be dispensed by the depository.

Thus, in accordance with another aspect of the present invention, there are
provided isolated polypeptides of epothilone B hydroxylase and variants and mutants
thereof and isolated polypeptides of ferredoxin or variants thereof. In one
embodiment of the present invention, by "polypeptide" it is meant to include the
amino acid sequence of SEQ ID NO:2, and fragments or variants, which retain
essentially the same biological activity and/or function as this epothilone B
hydroxylase. In another embodiment of the present invention, by "polypeptide" it is
meant to include the amino acid sequence of SEQ ID NO:4, and fragments and/or
variants, which retain essentially the same biological activity and/or function as this
ferredoxin.

By "variants" as used herein it is meant to include polypeptides with amino
acid sequences with conservative amino acid substitutions as compared to SEQ ID
NO:2 or 4 which are demonstrated to exhibit similar biological activity and/or
function to SEQ ID NO:2 or 4. By "conservative amino acid substitutions" it is
meant to include replacement, one for another, of the aliphatic amino acids such as
Ala, Val, Leu and Ile, the hydroxyl residues Ser and Thr, the acidic residues Asp and
Glu, and the amide residues Asn and Gln. Exemplary variant amino acid sequences
of the present invention are depicted in SEQ ID NO:43-49 and the amino acid
substitutions of these exemplary variants are described in Table 2, supra.
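
The definition of a conservative substitution given above can be expressed as a
simple group lookup. The sketch below is illustrative only; it encodes just the
four groups named in the text (aliphatic, hydroxyl, acidic and amide residues),
and the function name is hypothetical.

```python
# Residue groups exactly as listed in the definition above (three-letter codes).
CONSERVATIVE_GROUPS = [
    {"ALA", "VAL", "LEU", "ILE"},  # aliphatic
    {"SER", "THR"},                # hydroxyl
    {"ASP", "GLU"},                # acidic
    {"ASN", "GLN"},                # amide
]


def is_conservative(original, replacement):
    """Return True if both residues differ and fall within one listed group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

Under this definition valine to alanine is conservative, whereas methionine to
valine is not, because methionine is not among the listed aliphatic residues.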

By "mutants" as used herein it is meant to include polypeptides encoded by
nucleic acid sequences with one or more point mutations, or deletions or additions of
nucleic acids as compared to SEQ ID NO:1 or 3, but which still have similar activity
to the polypeptides encoded by SEQ ID NO:1 or 3. In a preferred embodiment,
mutations are made to the nucleic acid that alter the substrate specificity and/or yield
from the polypeptide encoded thereby. A preferred region of mutation with respect to
the epothilone B hydroxylase gene is that region of the nucleic acid sequence coding
for the approximately 113 amino acid residues comprising the active site of the
enzyme. Also preferred are mutants with at least one amino acid substitution at
amino acid position GLU31, ARG67, ARG88, ILE92, ALA93, VAL106, ILE130,
ALA140, MET176, PHE190, GLU231, SER294, PHE237, or ILE365 of SEQ ID
NO:1. Exemplary mutants ebh25-1, ebh10-53, ebh24-16, ebh24-16d8, ebh24-16c11,
ebh24-16-16, ebh24-16-74, ebh24-16g8, ebh24-16b9 and the nucleic acid sequences
encoding such mutants of the present invention are depicted in SEQ ID NO:31, 33,
35, 61, 63, 65, 67, 69, 71, 73 and 75, and SEQ ID NO:30, 32, 34, 60, 62, 64, 66, 68,
70, 72 and 74, respectively.

A 3-dimensional model of epothilone B hydroxylase has also been constructed
in accordance with general teachings of Greer et al. (Comparative modeling of
homologous proteins. Methods In Enzymology 202, 239-52, 1991), Lesk et al.
(Homology Modeling: Inferences from Tables of Aligned Sequences. Curr. Op.
Struc. Biol. (2) 242-247, 1992), and Cardozo et al. (Homology modeling by the ICM
method. Proteins 23, 403-14, 1995) on the basis of the known structure of a
homologous protein EryF (PDB Code 1JIN chain A). Homology between these
sequences is 34%. Alignment of the sequences of epothilone B hydroxylase (SEQ ID
NO:2) and EryF (PDB Code 1JIN chain A; SEQ ID NO:76) is depicted in Figure 3.






WO 2004/061116 PCT/US2003/034082 

A homology model of epothilone B hydroxylase based upon sequence alignment with 
EryF is depicted in Figure 4. 
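
A figure such as the 34% reported above is typically derived from a pairwise
alignment like that of Figure 3. The sketch below shows one common way to
compute percent identity from two aligned strings. It is an assumption that the
reported value was obtained this way; the text does not state the formula, and
the function name and the gap character are hypothetical choices.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned, non-gap columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # columns containing a gap are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared
```

Other conventions divide by the length of the shorter sequence or by the full
alignment length, so values from different tools are not always directly
comparable.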

An energy plot of the epothilone B hydroxylase model relative to EryF (PDB
code 1JIN) was also prepared and is depicted in Figure 5. An averaging window size
of 51 residues was used at a given residue position to calculate the average of the
energies of the 51 residues in the sequence that lie with the given residue at the central
position. As shown in Figure 5, all energies along the sequence lie below zero, thus
indicating that the modeled structure as set forth in Figure 4 and Appendix 1 is
reasonable.
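
The window averaging described above can be sketched as follows. This is an
illustration under stated assumptions rather than the procedure actually used
to prepare Figure 5: the text does not say how positions near the ends of the
sequence were treated, so the sketch simply truncates the window there, and the
function name is hypothetical.

```python
def windowed_average(energies, window=51):
    """Average per-residue energies over a window centered on each residue.

    `window` must be odd so that the given residue sits at the central
    position. Near the termini the window is truncated to the residues that
    exist (an assumption; the source does not specify end handling).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    averaged = []
    for i in range(len(energies)):
        lo = max(0, i - half)
        hi = min(len(energies), i + half + 1)
        segment = energies[lo:hi]
        averaged.append(sum(segment) / len(segment))
    return averaged


def all_below_zero(values):
    """The acceptance criterion stated in the text: every averaged energy < 0."""
    return all(v < 0 for v in values)
```

A profile that passes all_below_zero after averaging corresponds to the
situation described for Figure 5.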

The three-dimensional structure represented in the homology model of
epothilone B hydroxylase of Figure 4 is defined by a set of structure coordinates as set
forth in Appendix 1. The term "structure coordinates" refers to Cartesian coordinates
generated from the building of a homology model. As will be understood by those of
skill in the art, however, a set of structure coordinates for a protein is a relative set of
points that define a shape in three dimensions. Thus, it is possible that an entirely
different set of coordinates could define a similar or identical shape. Moreover, slight
variations in the individual coordinates, as emanate from generation of similar
homology models using different alignment templates and/or using different methods
in generating the homology model, will have minor effects on the overall shape.

Variations in coordinates may also be generated because of mathematical
manipulations of the structure coordinates. For example, the structure coordinates set
forth in Appendix 1 could be manipulated by fractionalization of the structure
coordinates; integer additions or subtractions to sets of the structure coordinates;
inversion of the structure coordinates; or any combination of the above.

Various computational analyses are therefore necessary to determine whether
a molecule or a portion thereof is sufficiently similar to all or parts of epothilone B
hydroxylase described above as to be considered the same. Such analyses may be
carried out in current software applications, such as SYBYL version 6.7 or
INSIGHTII (Molecular Simulations Inc., San Diego, CA) version 2000 and as
described in the accompanying User's Guides.

For example, the superimposition tool in the program SYBYL allows
comparisons to be made between different structures and different conformations of
the same structure. The procedure used in SYBYL to compare structures is divided
into four steps: 1) load the structures to be compared; 2) define the atom equivalencies
in these structures; 3) perform a fitting operation; and 4) analyze the results. Each
structure is identified by a name. One structure is identified as the target (i.e., the
fixed structure); the second structure (i.e., moving structure) is identified as the source
structure. Since atom equivalency within SYBYL is defined by user input, for the
purpose of this aspect of the present invention equivalent atoms are defined as protein
backbone atoms (N, Cα, C and O) for all conserved residues between the two
structures being compared. Further, only rigid fitting operations are considered.
When a rigid fitting method is used, the working structure is translated and rotated to
obtain an optimum fit with the target structure. The fitting operation uses an algorithm
that computes the optimum translation and rotation to be applied to the moving
structure, such that the root mean square difference of the fit over the specified pairs
of equivalent atoms is an absolute minimum. This number, given in angstroms, is
reported by SYBYL.

For the purposes of the present invention, any homology model of epothilone
B hydroxylase that has a root mean square deviation of conserved residue backbone
atoms (N, Cα, C, O) of less than about 4.0 Å when superimposed on the
corresponding backbone atoms described by structure coordinates listed in Appendix
1 is considered identical. More preferably, the root mean square deviation is less
than about 3.0 Å. More preferably the root mean square deviation is less than about
2.0 Å.

For the purpose of this invention, any homology model of epothilone B
hydroxylase that has a root mean square deviation of conserved residue backbone
atoms (N, Cα, C, O) of less than about 2.0 Å when superimposed on the
corresponding backbone atoms described by structure coordinates listed in Appendix
1 is considered identical. More preferably, the root mean square deviation is less
than about 1.0 Å.

In another embodiment of the present invention, structural models wherein
backbone atoms have been substituted with other elements which when superimposed
on the corresponding backbone atoms have low root mean square deviations are
considered to be identical. For example, a homology model where the original
backbone carbon, and/or nitrogen and/or oxygen atoms are replaced with other
elements having a root mean square deviation of about 4.0 Å, more preferably about
3.0 Å, even more preferably less than about 2 Å, when superimposed on the
corresponding backbone atoms described by structure coordinates listed in Appendix
1 is considered identical.

The term "root mean square deviation" means the square root of the arithmetic
mean of the squares of the deviations from the mean. It is a way to express the
deviation or variation from a trend or object. For purposes of this invention, the "root
mean square deviation" defines the variation in the backbone of a protein from the
relevant portion of the backbone of the epothilone B hydroxylase portion of the
complex as defined by the structure coordinates described herein.
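
For two structures that have already been superimposed, the root mean square
deviation over paired backbone atoms can be computed as sketched below. This is
a minimal illustration using the usual structural-comparison reading of the
term (square root of the mean squared distance between equivalent atoms); it
does not perform the fitting step itself, and the function names are
hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD in the input units between paired (x, y, z) atom positions.

    The two lists must hold equivalent atoms in the same order, for example
    the N, C-alpha, C and O atoms of the conserved residues.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def within_threshold(coords_a, coords_b, threshold=4.0):
    """Apply a cutoff such as the 'less than about 4.0 angstroms' criterion."""
    return rmsd(coords_a, coords_b) < threshold
```

The thresholds of 4.0, 3.0, 2.0 and 1.0 angstroms recited above would be passed
as the threshold argument.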

The present invention as embodied by the homology model enables the
structure-based design of additional mutants of epothilone B hydroxylase. For
example, using the homology model of the present invention, residues lying within
10 Å of the binding site of epothilone B hydroxylase have now been defined. These
residues include LEU39, GLN43, ALA45, MET57, LEU58, HIS62, PHE63, SER64,
SER65, ASP66, ARG67, GLN68, SER69, LEU74, MET75, VAL76, ALA77,
ARG78, GLN79, ILE80, ASP84, LYS85, PRO86, PHE87, ARG88, PRO89, SER90,
LEU91, ILE92, ALA93, MET94, ASP95, HIS99, ARG103, PHE110, ILE155,
PHE169, GLN170, CYS172, SER173, SER174, ARG175, MET176, LEU177,
SER178, ARG179, ARG186, PHE190, LEU193, VAL233, GLY234, LEU235,
ALA236, PHE237, LEU238, LEU239, LEU240, ILE241, ALA242, GLY243,
HIS244, GLU245, THR246, THR247, ALA248, ASN249, MET250, LEU283,
THR287, ILE288, ALA289, GLU290, THR291, ALA292, THR293, SER294,
ARG295, PHE296, ALA297, THR298, GLU312, GLY313, VAL314, VAL315,
GLY316, VAL344, ALA345, PHE346, GLY347, PHE348, VAL350, HIS351,
GLN352, CYS353, LEU354, GLY355, GLN356, LEU358, ALA359, GLU362,
LYS389, ASP391, SER392, THR393, ILE394 and TYR395 as set forth in Appendix
1. Mutants with mutations at one or more of these positions are expected to exhibit
altered biological function and/or specificity and thus comprise another embodiment
of preferred mutants of the present invention. Another embodiment of preferred
mutants are molecules that have a root mean square deviation from the backbone
atoms of said epothilone B hydroxylase of not more than about 4.0 Å.
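
Selecting the residues that lie within 10 Å of a binding site, as described
above, amounts to a distance cutoff over the model coordinates. The sketch
below illustrates the idea only; the tuple layout for atoms, the use of a list
of site points, and the function name are assumptions and do not reflect the
format of Appendix 1.

```python
import math


def residues_near_site(atoms, site_points, cutoff=10.0):
    """Return residue labels having at least one atom within `cutoff` of the site.

    `atoms` is an iterable of (residue_name, residue_number, x, y, z) tuples and
    `site_points` is a list of (x, y, z) positions describing the binding site,
    for example bound-ligand or heme atom positions (hypothetical input layout).
    """
    near = set()
    for res_name, res_num, x, y, z in atoms:
        for site in site_points:
            if math.dist((x, y, z), site) <= cutoff:
                near.add((res_num, res_name))
                break  # one atom within range is enough for this residue
    # Sort by residue number and format as in the text, e.g. "LEU39".
    return [f"{name}{num}" for num, name in sorted(near)]
```

Residues are reported once even when several of their atoms fall inside the
cutoff.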

The structure coordinates of an epothilone B hydroxylase homology model or
portions thereof are stored in a machine-readable storage medium. Such data may be
used for a variety of purposes, such as drug discovery.

Accordingly, another aspect of the present invention relates to a
machine-readable data storage medium comprising a data storage material encoded
with the structure coordinates set forth in Appendix 1.

The three-dimensional model structure of epothilone B hydroxylase can also
be used to identify modulators of biological function and potential substrates of the
enzyme. Various methods or combinations thereof can be used to identify such
modulators.

For example, a test compound can be modeled that fits spatially into a binding
site in epothilone B hydroxylase, according to Appendix 1. Structure coordinates of
amino acids within 10 Å of the binding region of epothilone B hydroxylase defined by
amino acids LEU39, GLN43, ALA45, MET57, LEU58, HIS62, PHE63, SER64,
SER65, ASP66, ARG67, GLN68, SER69, LEU74, MET75, VAL76, ALA77,
ARG78, GLN79, ILE80, ASP84, LYS85, PRO86, PHE87, ARG88, PRO89, SER90,
LEU91, ILE92, ALA93, MET94, ASP95, HIS99, ARG103, PHE110, ILE155,
PHE169, GLN170, CYS172, SER173, SER174, ARG175, MET176, LEU177,
SER178, ARG179, ARG186, PHE190, LEU193, VAL233, GLY234, LEU235,
ALA236, PHE237, LEU238, LEU239, LEU240, ILE241, ALA242, GLY243,
HIS244, GLU245, THR246, THR247, ALA248, ASN249, MET250, LEU283,
THR287, ILE288, ALA289, GLU290, THR291, ALA292, THR293, SER294,
ARG295, PHE296, ALA297, THR298, GLU312, GLY313, VAL314, VAL315,
GLY316, VAL344, ALA345, PHE346, GLY347, PHE348, VAL350, HIS351,
GLN352, CYS353, LEU354, GLY355, GLN356, LEU358, ALA359, GLU362,
LYS389, ASP391, SER392, THR393, ILE394 and TYR395, and the coordinated
heme group, HEM1, can also be used to identify desirable structural and chemical
features of such modulators. Identified structural or chemical features can then be
employed to design or select compounds as potential epothilone B hydroxylase
ligands. By structural and chemical features it is meant to include, but is not limited
to, covalent bonding, van der Waals interactions, hydrogen bonding interactions,
charge interaction, hydrophobic bonding interaction, and dipole interaction.
Compounds identified as potential epothilone B hydroxylase ligands can then be
synthesized and screened in an assay characterized by binding of a test compound to
epothilone B hydroxylase, or in characterizing the ability of epothilone B hydroxylase
to modulate a protease target in the presence of a small molecule. Examples of assays
useful in screening of potential epothilone B hydroxylase ligands include, but are not
limited to, screening in silico, in vitro assays and high throughput assays.

As will be understood by those of skill in the art upon this disclosure, other
structure-based design methods can be used. Various computational structure-based
design methods have been disclosed in the art. For example, a number of computer
modeling systems are available in which the sequence of epothilone B hydroxylase
and the epothilone B hydroxylase structure (i.e., atomic coordinates of epothilone B
hydroxylase as provided in Appendix 1 and/or the atomic coordinates within 10 Å of
the binding region as provided above) can be input. This computer system then
generates the structural details of one or more of these regions in which a potential
epothilone B hydroxylase modulator binds so that complementary structural details of
the potential modulators can be determined. Design in these modeling systems is
generally based upon the compound being capable of physically and structurally
associating with epothilone B hydroxylase. In addition, the compound must be able
to assume a conformation that allows it to associate with epothilone B hydroxylase.
Some modeling systems estimate the potential inhibitory or binding effect of a
potential epothilone B hydroxylase substrate or modulator prior to actual synthesis
and testing.

Methods for screening chemical entities or fragments for their ability to
associate with a given protein target are also well known. Often these methods begin
by visual inspection of the binding site on the computer screen. Selected fragments or
chemical entities are then positioned in a binding region of epothilone B hydroxylase.
Docking is accomplished using software such as INSIGHTII, QUANTA and SYBYL,
followed by energy minimization and molecular dynamics with standard molecular
mechanics force fields such as MMFF, CHARMM and AMBER. Examples of
computer programs which assist in the selection of chemical fragments or chemical
entities useful in the present invention include, but are not limited to, GRID
(Goodford, 1985), AUTODOCK (Goodsell, 1990), and DOCK (Kuntz et al. 1982).

Upon selection of preferred chemical entities or fragments, their relationship
to each other and epothilone B hydroxylase can be visualized and then assembled into
a single potential modulator. Programs useful in assembling the individual chemical
entities include, but are not limited to, CAVEAT (Bartlett et al. 1989) and 3D
Database systems (Martin 1992).

Alternatively, compounds may be designed de novo using either an empty
active site or optionally including some portion of a known inhibitor. Methods of this
type of design include, but are not limited to, LUDI (Bohm 1992) and LeapFrog
(Tripos Inc., St. Louis, MO).

Programs such as DOCK (Kuntz et al. 1982) can be used with the atomic
coordinates from the homology model to identify potential ligands from databases or
virtual databases which potentially bind in the active site binding region and which
may therefore be suitable candidates for synthesis and testing.

Also provided in the present invention are vectors comprising polynucleotides
of the present invention and host cells which are genetically engineered with vectors
of the present invention to produce epothilone B hydroxylase or active fragments and
variants or mutants of this enzyme and/or ferredoxin or active fragments thereof.
Generally, any vector suitable to maintain, propagate or express polynucleotides to
produce these polypeptides in the host cell may be used for expression in this regard.
In accordance with this aspect of the invention the vector may be, for example, a
plasmid vector, a single- or double-stranded phage vector, or a single- or
double-stranded RNA or DNA viral vector. Vectors may be extra-chromosomal or
designed for integration into the host chromosome. Such vectors include, but are not
limited to, chromosomal, episomal and virus-derived vectors, e.g., vectors derived
from bacterial plasmids, bacteriophages, yeast episomes, yeast chromosomal elements,
and viruses such as baculoviruses, papova viruses, SV40, vaccinia viruses,
adenoviruses, fowl pox viruses, pseudorabies viruses and retroviruses, and vectors
derived from combinations thereof, such as those derived from plasmid and
bacteriophage genetic elements, cosmids and phagemids.

Useful expression vectors for prokaryotic hosts include, but are not limited to,
bacterial plasmids, such as those from E. coli, Bacillus or Streptomyces, including
pBluescript, pGEX-2T, pUC vectors, pET vectors, ColE1, pCR1, pBR322, pMB9,
pCW, pBMS200, pBMS2020, pIJ101, pIJ702, pANT849, pOJ260, pOJ446,
pSET152, pKC1139, pKC1218, pFD666 and their derivatives, wider host range
plasmids, such as RP4, phage DNAs, e.g., the numerous derivatives of phage lambda,
e.g., NM989, λGT10 and λGT11, and other phages, e.g., M13 and filamentous single
stranded phage DNA.

Vectors of the present invention for use in yeast will typically contain an
origin of replication suitable for use in yeast and a selectable marker that is functional
in yeast. Examples of yeast vectors useful in the present invention include, but are not
limited to, Yeast Integrating plasmids (e.g., YIp5) and Yeast Replicating plasmids
(the YRp and YEp series plasmids), Yeast Centromere plasmids (the YCp series
plasmids), Yeast Artificial Chromosomes (YACs) which are based on yeast linear
plasmids, denoted YLp, pGPD-2, 2μ plasmids and derivatives thereof, and improved
shuttle vectors such as those described in Gietz et al., Gene, 74: 527-34 (1988)
(YIplac, YEplac and YCplac).

Mammalian vectors useful for recombinant expression may include a viral
origin, such as the SV40 origin (for replication in cell lines expressing the large
T-antigen, such as COS1 and COS7 cells), the papillomavirus origin, or the EBV
origin for long term episomal replication (for use, e.g., in 293-EBNA cells, which
constitutively express the EBV EBNA-1 gene product and adenovirus E1A).
Expression in mammalian cells can be achieved using a variety of plasmids,
including, but not limited to, pSV2, pBC12BI, and p91023, pCDNA vectors as well as
lytic virus vectors (e.g., vaccinia virus, adenovirus, and baculovirus), episomal virus
vectors (e.g., bovine papillomavirus), and retroviral vectors (e.g., murine
retroviruses). Useful vectors for insect cells include baculoviral vectors and pVL941.

Selection of an appropriate promoter to direct mRNA transcription and 
construction of expression vectors are well known. In general, however, expression 

constructs will contain sites for transcription initiation and termination, and, in the
transcribed region, a ribosome binding site for translation. The coding portion of the 
mature transcripts expressed by the constructs will include a translation initiating 






codon at the beginning and a termination codon appropriately positioned at the end of 
the polypeptide to be translated. 

Examples of useful promoters for prokaryotes include, but are not limited to,
phage promoters such as phage lambda pL promoter, the trc promoter, a hybrid
derived from the trp and lac promoters, the bacteriophage T7 promoter, the TAC or
TRC system, the major operator and promoter regions of phage lambda, the control
regions of fd coat protein, snpA promoter, melC promoter, ermE* promoter or the
araBAD operon. Examples of useful promoters for yeast include, but are not limited
to, the CYC1 promoter, the GAL1 promoter, the GAL10 promoter, ADH1 promoter,
the promoters of the yeast α-mating system, and the GPD promoter. Examples of
promoters routinely used in mammalian expression vectors include, but are not
limited to, the CMV immediate early promoter, the HSV thymidine kinase promoter,
the early and late SV40 promoters, the promoters of retroviral LTRs, such as those of
the Rous Sarcoma Virus (RSV), and metallothionein promoters, such as the mouse
metallothionein-I promoter.

Vectors comprising the polynucleotides can be introduced into host cells using
any number of well known techniques including infection, transduction, transfection,
transvection and transformation. The polynucleotides may be introduced into a host
alone or with additional polynucleotides encoding, for example, a selectable marker
or ferredoxin reductase. In a preferred embodiment of the present invention the
polynucleotides for epothilone B hydroxylase and ferredoxin are introduced into the
host cell. Host cells for the various expression constructs are well known, and those
of skill can routinely select a host cell for expressing the epothilone B hydroxylase
and/or ferredoxin in accordance with this aspect of the present invention. Examples
of mammalian expression systems useful in the present invention include, but are not
limited to, the C127, 3T3, CHO, HeLa, human kidney 293 and BHK cell lines, and
the COS-7 line of monkey kidney fibroblasts.

Alternatively, as exemplified herein, epothilone B hydroxylase and ferredoxin 
can be expressed recombinantly in microorganisms. 

Accordingly, another aspect of the present invention relates to recombinantly
produced microorganisms which express epothilone B hydroxylase alone or in
conjunction with the ferredoxin and which are capable of hydroxylating a compound,






and in particular an epothilone, having a terminal alkyl group to produce ones having
a terminal hydroxyalkyl group. The recombinantly produced microorganisms are
produced by transforming cells such as bacterial cells with a plasmid comprising a
nucleic acid sequence encoding epothilone B hydroxylase. In a preferred
embodiment, the cells are transformed with a plasmid comprising a nucleic acid
encoding epothilone B hydroxylase or mutants or variants thereof as well as the
nucleic acid sequence encoding ferredoxin located downstream of the epothilone B
hydroxylase gene. Examples of microorganisms which can be transformed with these
plasmids to produce the recombinant microorganisms of the present invention
include, but are not limited to, Escherichia coli, Bacillus megaterium, Amycolatopsis
orientalis, Sorangium cellulosum, Rhodococcus erythropolis, and Streptomyces
species such as Streptomyces lividans, Streptomyces virginiae, Streptomyces
venezuelae, Streptomyces albus, Streptomyces coelicolor, Streptomyces rimosus and
Streptomyces griseus.

The recombinantly produced microorganisms of the present invention are
useful in microbial processes or methods for production of compounds, and in
particular epothilones, containing a terminal hydroxyalkyl group. In general, the
hydroxyalkyl-bearing product can be produced by culturing the recombinantly
produced microorganism or enzyme derived therefrom, capable of selectively
hydroxylating a terminal carbon or alkyl, in the presence of a suitable substrate in an
aqueous nutrient medium containing sources of assimilable carbon and nitrogen,
under submerged aerobic conditions.

Suitable epothilones employed as substrate for the method of the present
invention may be any such compound having a terminal carbon or terminal alkyl
group capable of undergoing the enzymatic hydroxylation of the present invention.
The starting material, or substrate, can be isolated from natural sources, such as
Sorangium cellulosum, or can be a synthetically formed epothilone. Other
substrates having a terminal carbon or terminal alkyl group capable of undergoing an
enzymatic hydroxylation can be employed by the methods herein. For example,
compactin can be used as a substrate, which upon hydroxylation forms the compound
pravastatin. Methods for hydroxylating compactin to pravastatin via an
Actinomadura strain are set forth in U.S. Patent 5,942,423 and U.S. Patent 6,274,360.

For example, using the recombinant microorganisms of the present invention
at least one epothilone can be prepared as described in WO 00/39276, U.S. Serial
No. 09/468,854, filed December 21, 1999, the text of which is incorporated herein as
if set forth at length. An epothilone of the following Formula I

HO-CH2-(A1)n-(Q)m-(A2)o-E (I)

where

A1 and A2 are independently selected from the group of optionally substituted
C1-C3 alkyl and alkenyl;

Q is an optionally substituted ring system containing one to three rings and at
least one carbon to carbon double bond in at least one ring;

n, m, and o are integers selected from the group consisting of zero and 1,
where at least one of m or n or o is 1; and

E is an epothilone core; can be prepared.
This method comprises the steps of contacting at least one epothilone of the following
formula II

CH3-(A1)n-(Q)m-(A2)o-E (II)

where A1, Q, A2, E, n, m, and o are defined as above;
with a recombinantly produced microorganism, or an enzyme derived
therefrom, which is capable of selectively catalyzing the hydroxylation of formula II,
and effecting said hydroxylation.

In a preferred embodiment, the starting material is epothilone B. Epothilone B
can be obtained from the fermentation of Sorangium cellulosum So ce90, as described
in DE 41 38 042 and WO 93/10121. The strain has been deposited at the Deutsche
Sammlung von Mikroorganismen (German Collection of Microorganisms) (DSM)
under No. 6773. The process of fermentation is also described in Hofle, G., et al.,
Angew. Chem. Int. Ed. Engl., Vol. 35, No. 13/14, 1567-1569 (1996). Epothilone B can
also be obtained by chemical means, such as those disclosed by Meng, D., et al., J.
Am. Chem. Soc., Vol. 119, No. 42, 10073-10092 (1996); Nicolaou, K.C., et al., J. Am.
Chem. Soc., Vol. 119, No. 34, 7974-7991 (1997) and Schinzer, D., et al., Chem. Eur.
J., Vol. 5, No. 9, 2483-2491 (1999).

Growth of the recombinantly produced microorganism selected for use in the
process may be achieved by one of ordinary skill in the art by the use of appropriate
nutrient medium. Appropriate media for the growing of the recombinantly produced
microorganisms include those that provide nutrients necessary for the growth of
microbial cells. See, for example, T. Nagodawithana and J. M. Wasileski, Chapter 2:
"Media Design for Industrial Fermentations," Nutritional Requirements of
Commercially Important Microorganisms, edited by T. W. Nagodawithana and G.
Reed, Esteekay Associates, Inc., Milwaukee, WI, 18-45 (1998); T. L. Miller and B.
W. Churchill, Chapter 10: "Substrates for Large-Scale Fermentations," Manual of
Industrial Microbiology and Biotechnology, edited by A. L. Demain and N. A.
Solomon, American Society for Microbiology, Washington, D.C., 122-136 (1986). A
typical medium for growth includes necessary carbon sources, nitrogen sources, and
trace elements. Inducers may also be added to the medium. The term inducer, as used
herein, includes any compound enhancing formation of the desired enzymatic activity
within the recombinantly produced microbial cell. Typical inducers as used herein
may include solvents used to dissolve substrates, such as dimethyl sulfoxide, dimethyl
formamide, dioxane, ethanol and acetone. Further, some substrates, such as
epothilone B, may also be considered to be inducers.

Carbon sources may include sugars such as glucose, fructose, galactose,
maltose, sucrose, mannitol, sorbitol, glycerol, starch and the like; organic acids such as
sodium acetate, sodium citrate, and the like; and alcohols such as ethanol, propanol
and the like. Preferred carbon sources include, but are not limited to, glucose,
fructose, sucrose, glycerol and starch.

Nitrogen sources may include N-Z amine A, corn steep liquor, soybean
meal, beef extract, yeast extract, tryptone, peptone, cottonseed meal, peanut meal,
amino acids such as sodium glutamate and the like, sodium nitrate, ammonium sulfate
and the like.

Trace elements may include magnesium, manganese, calcium, cobalt, nickel,
iron, sodium and potassium salts. Phosphates may also be added in trace or,
preferably, greater than trace amounts.

The medium employed for the fermentation may include more than one
carbon or nitrogen source or other nutrient.

For growth of the recombinantly produced microorganisms and/or
hydroxylation according to the method of the present invention, the pH of the medium
is preferably from about 5 to about 8 and the temperature is from about 14°C to about
37°C, preferably 28°C. The duration of the reaction is 1 to 100
hours, preferably 8 to 72 hours.

The medium is incubated for a period of time necessary to complete the
biotransformation as monitored by high performance liquid chromatography (HPLC).
Typically, the period of time needed to complete the transformation is twelve to one
hundred hours and preferably about 72 hours after the addition of the substrate. The
medium is placed on a rotary shaker (New Brunswick Scientific Innova 5000)
operating at 150 to 300 rpm and preferably about 250 rpm with a throw of 2 inches.

The hydroxyalkyl-bearing product can be recovered from the fermentation
broth by conventional means that are commonly used for the recovery of other known
biologically active substances. Examples of such recovery means include, but are not
limited to, isolation and purification by extraction with a conventional solvent, such as
ethyl acetate and the like; by pH adjustment; by treatment with a conventional resin,
for example, by treatment with an anion or cation exchange resin or a non-ionic
adsorption resin; by treatment with a conventional adsorbent, for example, by
distillation, by crystallization; or by recrystallization, and the like.

The extract obtained above from the biotransformation reaction mixture can be
further isolated and purified by column chromatography and analytical thin layer
chromatography.

The ability of a recombinantly produced microorganism of the present
invention to biotransform an epothilone having a terminal alkyl group to an
epothilone having a terminal hydroxyalkyl group was demonstrated. In these
experiments, a culture comprising a Streptomyces lividans clone containing a plasmid
with the ebh gene, as described in more detail in Example 11, was incubated with an
epothilone B suspension for 3 days at 30°C with agitation. A sample of the incubate
was extracted with an equal volume of 25% methanol:75% n-butanol, vortexed and
allowed to settle for 5 minutes. Two hundred µl of the organic phase was transferred
to an HPLC vial and analyzed by HPLC/MS (Example 12). A product peak of
epothilone F eluted at a retention time of 15.9 minutes and had a protonated molecular
weight of 524. The epothilone B substrate eluted at 19.0 minutes and had a
protonated molecular weight of 508. The peak retention times and molecular weights
were confirmed using known standards.
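
The mass shift between substrate and product can be checked with simple arithmetic. The following Python sketch is illustrative only and is not part of the disclosed method; the function name is hypothetical, and it assumes that each hydroxylation converts one C-H into C-OH, a net gain of one oxygen atom (nominal mass 16).

```python
# Nominal mass of one oxygen atom, the net gain when C-H becomes C-OH.
OXYGEN_NOMINAL_MASS = 16


def expected_product_mh(substrate_mh: int, hydroxylations: int = 1) -> int:
    """Return the nominal [M+H]+ expected after the given number of hydroxylations."""
    return substrate_mh + hydroxylations * OXYGEN_NOMINAL_MASS


# Epothilone B was observed at [M+H]+ 508; compare the result with the
# protonated molecular weight reported for epothilone F.
print(expected_product_mh(508))
```

A single hydroxylation of the epothilone B ion at 508 gives the value reported for the epothilone F peak.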

Rates of biotransformation of epothilone B by cells expressing ebh were also
compared to rates of biotransformation by ebh mutants. Cells expressing ebh
comprised a frozen spore preparation of S. lividans (pANT849-ebh). Cells
expressing mutants comprised frozen spore preparations of S. lividans
(pANT849-ebh10-53) and S. lividans (pANT849-ebh24-16). A frozen spore preparation of S.
lividans TK24 was used as the control. The cells were pre-incubated for several days
at 30°C. Following this pre-incubation, epothilone B in 100% EtOH was added to
each culture to a final concentration of 0.05% weight/volume. Samples were then
taken at 0, 24, 48 and 72 hours with the exception of the S. lividans
(pANT849-ebh24-16) culture, in which the epothilone B had been completely converted to
epothilone F at 48 hours. The samples were analyzed by HPLC. The results are
calculated as a percentage of the epothilone B at time 0 hours.

Epothilone B:

Time (hours)   TK24    pANT849-ebh   pANT849-ebh10-53   pANT849-ebh24-16
0              100%    100%          100%               100%
24             99%     78%           69%                56%
48             87%     19%           39%                0%
72             87%     0%            3%                 -




Epothilone F:

Time (hours)   TK24    pANT849-ebh   pANT849-ebh10-53   pANT849-ebh24-16
0              0%      0%            0%                 0%
24             0%      4%            9%                 23%
48             0%      21%           29%                52%
72             0%      14%           41%                -
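
The two time courses above can be combined to see how much of the starting epothilone B is accounted for as either remaining substrate or epothilone F. The following Python sketch is an illustration only; the dictionary layout and the `recovery` helper are hypothetical, the strain labels are written as reconstructed from the surrounding text, and `None` marks the 72 hour point of the pANT849-ebh24-16 culture, for which no value is reported.

```python
TIMES_H = (0, 24, 48, 72)

# Percent of the time-0 epothilone B; None marks a time point with no reported value.
EPOTHILONE_B = {
    "TK24": (100, 99, 87, 87),
    "pANT849-ebh": (100, 78, 19, 0),
    "pANT849-ebh10-53": (100, 69, 39, 3),
    "pANT849-ebh24-16": (100, 56, 0, None),
}
EPOTHILONE_F = {
    "TK24": (0, 0, 0, 0),
    "pANT849-ebh": (0, 4, 21, 14),
    "pANT849-ebh10-53": (0, 9, 29, 41),
    "pANT849-ebh24-16": (0, 23, 52, None),
}


def recovery(strain):
    """Sum of remaining epothilone B and formed epothilone F at each time point."""
    return [
        None if b is None or f is None else b + f
        for b, f in zip(EPOTHILONE_B[strain], EPOTHILONE_F[strain])
    ]


for strain in EPOTHILONE_B:
    print(strain, dict(zip(TIMES_H, recovery(strain))))
```

Sums below 100 indicate material that is not accounted for as epothilone B or epothilone F at that time point; the tables themselves do not say what becomes of it.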





The ability of cells expressing ebh to biotransform compactin to pravastatin
was also examined. In these experiments, frozen spore preparations of S. lividans
(pANT849) or S. lividans (pANT849-ebh) were grown for several days at 30°C.
Following the pre-incubation, an aliquot of each cell culture was transferred to a
polypropylene culture tube, compactin was added to each culture tube, and the tubes
were incubated for 24 hours at 30°C and 250 rpm. An aliquot of the culture broth was then
extracted and compactin and pravastatin values relative to the control S. lividans
(pANT849) culture were measured via HPLC.



Compactin and pravastatin as a percentage of starting compactin
concentration:

              S. lividans (pANT849)   S. lividans (pANT849-ebh)
Compactin     36%                     11%
Pravastatin   11%                     53%



As discussed supra, mutant eblOSA (SEQ ID NO:30) exhibits altered
substrate specificity and biotransformation of epothilone B by this mutant resulted in
a product with a different HPLC elution time than epothilone B or epothilone F. A
sample of this unknown was analyzed by LC-MS and was found to have a molecular
weight of 523 (M.W.), consistent with a single hydroxylation of epothilone B. The
structure of the biotransformation product was determined as 24-hydroxyl-epothilone
B, based on MS and NMR data (compared with data of epothilone B):

24-hydroxyl-epothilone B
Formula A

Molecular Formula: C27H41NO7S
Molecular Weight: 523

Mass Spectrum: ES+ (m/z): 524 ([M+H]+), 506.

LC-MS/MS: +ESI (m/z): 524, 506, 476, 436, 320

HRMS: Calculated for [M+H]+: 524.2682; Found: 524.2701

HPLC (Rt): 7.3 minutes (on the analytical HPLC system)

LC/NMR Observed Chemical Shifts

Varian AS-600 (Proton: 599.624 MHz),
Solvent: D2O/CD3CN (δ 1.94): ~4/6
Proton: δ 7.30 (s, 1H), 6.43 (s, 1H), 5.30 (m, 1H), 4.35 (m, 1H),
3.81 (m, 1H), 3.74 (m, 1H), 3.68 (m, 1H), 3.43 (m, 1H), 2.87
(m, 1H), 2.66 (s, 3H), 2.40 (m, 2H), 1.58 (b, 1H), 1.48 (b, 1H),
1.35 (m, 3H), 1.18 (s, 3H), 1.13 (s, 3H), 0.87 (m, 6H)
*Peaks between 1.8-2.1 ppm were not observed due to solvent
suppression.
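
The HRMS figures reported for Formula A can be cross-checked from the molecular formula. The following Python sketch is illustrative only; the helper names are hypothetical, and the atomic masses are standard monoisotopic values rounded to eight decimals. It assumes that [M+H]+ is computed by adding one hydrogen atom to C27H41NO7S and neglecting the electron mass, an assumption that reproduces the calculated value stated above.

```python
# Monoisotopic atomic masses in unified atomic mass units (standard values, rounded).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207100,
}


def monoisotopic_mass(composition: dict) -> float:
    """Sum the monoisotopic masses for an element -> count mapping."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def ppm_error(found: float, calculated: float) -> float:
    """Relative deviation of a measured m/z from the calculated m/z, in ppm."""
    return (found - calculated) / calculated * 1e6


# C27H41NO7S plus one hydrogen for [M+H]+ (electron mass neglected, see lead-in).
calculated_mh = monoisotopic_mass({"C": 27, "H": 42, "N": 1, "O": 7, "S": 1})
print(round(calculated_mh, 4))
print(round(ppm_error(524.2701, 524.2682), 2))
```

The second printed value expresses the difference between the found and calculated masses in parts per million, a common way to judge whether a proposed formula fits a high-resolution measurement.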

The proton chemical shift was assigned as follows:

Position   Proton   Pattern
1
2          2.40     m
3          4.35     m
4
5
6          3.43     m
7          3.68     m
8          1.58     m
9          1.35     b
10         1.48     b
10         1.35     b
11         SSP
12
13         2.87     m
14         SSP
15         5.30     m
16
17         6.43     s
18
19         7.30     s
20
21         2.66     s
22         1.18     s
23         0.87     m
24         3.81     m
24         3.74     m
25         0.87     m
26         1.13     s
27         SSP

*SSP: not observed due to solvent suppression.





Accordingly, the compositions and methods of the present invention are useful
in producing known compounds that are microtubule-stabilizing agents as well as new
compounds comprising epothilone analogs such as 24-hydroxyl-epothilone B
(Formula A) and pharmaceutically acceptable salts thereof expected to be useful as
microtubule-stabilizing agents. The microtubule stabilizing agents produced using
these compositions and methods are useful in the treatment of a variety of cancers and
other proliferative diseases including, but not limited to, the following:

carcinoma, including that of the bladder, breast, colon, kidney, liver, lung,
ovary, pancreas, stomach, cervix, thyroid and skin, including squamous cell
carcinoma;

hematopoietic tumors of lymphoid lineage, including leukemia, acute
lymphocytic leukemia, acute lymphoblastic leukemia, B-cell lymphoma, T-cell
lymphoma, Hodgkin's lymphoma, non-Hodgkin's lymphoma, hairy cell lymphoma and
Burkitt's lymphoma;

hematopoietic tumors of myeloid lineage, including acute and chronic
myelogenous leukemias and promyelocytic leukemia;

tumors of mesenchymal origin, including fibrosarcoma and
rhabdomyosarcoma;

other tumors, including melanoma, seminoma, teratocarcinoma,
neuroblastoma and glioma;

tumors of the central and peripheral nervous system, including astrocytoma,
neuroblastoma, glioma, and schwannomas;

tumors of mesenchymal origin, including fibrosarcoma, rhabdomyosarcoma,
and osteosarcoma; and

other tumors, including melanoma, xeroderma pigmentosum,
keratoacanthoma, seminoma, thyroid follicular cancer and teratocarcinoma.

Microtubule stabilizing agents produced using the compositions and methods
of the present invention will also inhibit angiogenesis, thereby affecting the growth of
tumors and providing treatment of tumors and tumor-related disorders. Such
anti-angiogenesis properties of these compounds will also be useful in the treatment of
other conditions responsive to anti-angiogenesis agents including, but not limited to,
certain forms of blindness related to retinal vascularization, arthritis, especially
inflammatory arthritis, multiple sclerosis, restenosis and psoriasis.

Microtubule stabilizing agents produced using the compositions and methods
of the present invention will induce or inhibit apoptosis, a physiological cell death
process critical for normal development and homeostasis. Alterations of apoptotic
pathways contribute to the pathogenesis of a variety of human diseases. Compounds
of the present invention such as those set forth in formula I and II and Formula A, as
modulators of apoptosis, will be useful in the treatment of a variety of human diseases
with aberrations in apoptosis including, but not limited to, cancer and precancerous
lesions, immune response related diseases, viral infections, degenerative diseases of
the musculoskeletal system and kidney disease.

Without wishing to be bound to any mechanism or morphology, microtubule
stabilizing agents produced using the compositions and methods of the present
invention may also be used to treat conditions other than cancer or other proliferative
diseases. Such conditions include, but are not limited to, viral infections such as
herpesvirus, poxvirus, Epstein-Barr virus, Sindbis virus and adenovirus; autoimmune
diseases such as systemic lupus erythematosus, immune mediated glomerulonephritis,
rheumatoid arthritis, psoriasis, inflammatory bowel diseases and autoimmune diabetes
mellitus; neurodegenerative disorders such as Alzheimer's disease, AIDS-related
dementia, Parkinson's disease, amyotrophic lateral sclerosis, retinitis pigmentosa,
spinal muscular atrophy and cerebellar degeneration; AIDS; myelodysplastic
syndromes; aplastic anemia; ischemic injury associated myocardial infarctions; stroke
and reperfusion injury; restenosis; arrhythmia; atherosclerosis; toxin-induced or
alcohol induced liver diseases; hematological diseases such as chronic anemia and
aplastic anemia; degenerative diseases of the musculoskeletal system such as
osteoporosis and arthritis; aspirin-sensitive rhinosinusitis; cystic fibrosis; multiple
sclerosis; kidney diseases; and cancer pain.

The following nonlimiting examples are provided to further illustrate the 
present invention. 

EXAMPLES

Example 1: Reagents
R2 Medium was prepared as follows:
A solution containing sucrose (103 grams), K2SO4 (0.25 grams), MgCl2·6H2O
(10.12 grams), glucose (10 grams), Difco Casaminoacids (0.1 grams) and distilled
water (800 ml) was prepared. Eighty ml of this solution was then poured into a 200
ml screw capped bottle containing 2.2 grams Difco Bacto agar. The bottle was
capped and autoclaved. At time of use, the medium was remelted and the following
autoclaved solutions were added in the order listed:
1 ml KH2PO4 (0.5%)
8 ml CaCl2·2H2O (3.68%)
1.5 ml L-proline (20%)
10 ml TES buffer (5.73%, adjusted to pH 7.2)
0.2 ml Trace element solution containing ZnCl2 (40 mg), FeCl3·6H2O (200 mg),
CuCl2·2H2O (10 mg), MnCl2·4H2O (10 mg), Na2B4O7·10H2O (10 mg), and
(NH4)6Mo7O24·4H2O
0.5 ml NaOH (1N) (sterilization not required)
0.5 ml Required growth factors for auxotrophs (Histidine (50 µg/ml); Cysteine
(37 µg/ml); adenine, guanine, thymidine and uracil (7.5 µg/ml); and Vitamins (0.5
µg/ml)).

R2YE medium was prepared in the same fashion as R2 medium. However, 5 ml of
Difco yeast extract (10%) was added to each 100 ml flask at time of use.

P (protoplast) buffer was prepared as follows:

A basal solution made up of the following was prepared:
Sucrose (103 grams)
K2SO4 (0.25 grams)
MgCl2·6H2O (2.02 grams)
Trace Element Solution as described for R2 medium (2 ml)
Distilled water to 800 ml
Eighty ml aliquots of the basal solution were then dispensed and autoclaved. Before
use, the following was added to each flask in the order listed:
1 ml KH2PO4 (0.5%)
10 ml CaCl2·2H2O (3.68%)
TES buffer (5.75%, adjusted to pH 7.2)

T (transformation) buffer was prepared by mixing the following sterile solutions:
25 ml Sucrose (10.3%)
75 ml distilled water
1 ml Trace Element Solution as described for R2 medium
1 ml K2SO4 (2.5%)

The following are then added to 9.3 ml of this solution:
0.2 ml CaCl2 (5M)
0.5 ml Tris maleic acid buffer prepared from 1 M solution of Tris adjusted to
pH 8.0 by adding maleic acid.
For use, 3 parts by volume of the above solution are added to 1 part by weight of PEG
1000, previously sterilized by autoclaving.

L (lysis) buffer was prepared by mixing the following sterile solutions:
100 ml Sucrose (10.3%)
10 ml TES buffer (5.73%, adjusted to pH 7.2)
1 ml K2SO4 (2.5%)
1 ml Trace Element Solution as described for R2 medium
1 ml KH2PO4 (0.5%)
0.1 ml MgCl2·6H2O (2.5 M)
1 ml CaCl2 (0.25 M)

CRM Medium

A solution containing the following components was prepared in 1 liter of
dH2O: glucose (10 grams), sucrose (103 grams), MgCl2·6H2O (10.12 grams), BBL™
trypticase soy broth (15 grams) (Becton Dickinson Microbiology Systems, Sparks,
Maryland, USA), and BBL™ yeast extract (5 grams) (Becton Dickinson
Microbiology Systems). The solution was autoclaved for 30 minutes. Thiostrepton
was added to a concentration of 10 µg/ml for cultures propagated with plasmids.

Electroporation Buffer

A solution containing 30% (wt/vol) PEG 1000, 10% glycerol, and 6.5%
sucrose was prepared in dH2O. The solution was sterilized by vacuum filtration
through a 0.22 µm cellulose acetate filter.

Example 2: Extraction of Chromosomal DNA from Strain SC15847

Genomic DNA was isolated from an Amycolatopsis orientalis soil isolate,
strain designation SC15847 (ATCC PT-1043), using a guanidine-detergent lysis
method, DNAzol reagent (Invitrogen, Carlsbad, California, USA). The SC15847
culture was grown 24 hours at 28°C in F7 medium (glucose 2.2%, yeast extract 1.0%,
malt extract 1.0%, peptone 0.1%, pH 7.0). Twenty ml of culture was harvested by
centrifugation and resuspended in 20 ml of DNAzol, mixed by pipetting and
centrifuged 10 minutes in the Beckman TJ6 centrifuge. Ten ml of 100% ethanol was
added, inverted several times and stored at room temperature 3 minutes. The DNA
was spooled on a glass pipette, washed in 100% ethanol and allowed to air dry 10
minutes. The pellet was resuspended in 500 µl of 8 mM NaOH and once dissolved it
was neutralized with 30 µl of 1M HEPES pH 7.2.

Example 3: PCR Reactions

PCR reactions were prepared in a volume of 50 µl, containing 200-500 ng of
genomic DNA or 1.0 µl of the cDNA, and a forward and a reverse primer, the forward
primer being either P450-1+ (SEQ ID NO:23) or P450-1a+ (SEQ ID NO:24) or P450-2+
(SEQ ID NO:25) and the reverse primer being P450-3- (SEQ ID NO:27) or P450-2- (SEQ
ID NO:26). All primers were added to a final concentration of 1.4-2.0 µM. The PCR
reaction was prepared with 1 µl of Taq enzyme (2.5 units) (Stratagene), 5 µl of Taq
buffer and 4 µl of 2.5 mM of dNTPs with dH2O to 50 µl. The cycling reactions were
performed on a Geneamp® PCR system with the following protocol: 95°C for 5
minutes, 5 cycles [95°C 30 seconds, 37°C 15 seconds (30% ramp), 72°C 30 seconds],
35 cycles (94°C 30 seconds, 65°C 15 seconds, 72°C 30 seconds), 72°C 7 minutes.
The expected sizes for the reactions are 340 bp for the P450-1+ (SEQ ID NO:23) or
P450-1a+ (SEQ ID NO:24) and P450-3- (SEQ ID NO:27) primer pairs, 240 bp for the
P450-1+ (SEQ ID NO:23) and P450-2- (SEQ ID NO:26) primer pairs and 130 bp for
the P450-2+ (SEQ ID NO:25) and P450-3- (SEQ ID NO:27) primer pairs.
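
The cycling protocol of this example can be written down as a small data structure, which makes the stage order explicit and allows a quick estimate of run time. The following Python sketch is illustrative only; the class and function names are hypothetical, ramp times (including the 30% ramp setting) are not modelled, and the dNTP calculation assumes that "2.5 mM" refers to the concentration of the stock solution that is pipetted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # hold time at that temperature


@dataclass(frozen=True)
class Stage:
    cycles: int    # number of times the steps are repeated
    steps: tuple   # ordered Step objects


# Protocol as listed in Example 3 (hold times only).
PROGRAM = (
    Stage(1, (Step(95, 300),)),
    Stage(5, (Step(95, 30), Step(37, 15), Step(72, 30))),
    Stage(35, (Step(94, 30), Step(65, 15), Step(72, 30))),
    Stage(1, (Step(72, 420),)),
)


def total_hold_seconds(program) -> int:
    """Total programmed hold time, excluding temperature ramps."""
    return sum(stage.cycles * sum(step.seconds for step in stage.steps) for stage in program)


def final_concentration(stock: float, stock_volume_ul: float, total_volume_ul: float = 50.0) -> float:
    """Concentration after dilution into the reaction, in the units of the stock."""
    return stock * stock_volume_ul / total_volume_ul


print(total_hold_seconds(PROGRAM) / 60, "minutes of programmed holds")
print(final_concentration(2.5, 4.0), "mM dNTPs in the 50 µl reaction")
```

The actual run time on the instrument is longer than the hold-time total because of ramping between temperatures.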

Example 4: Cloning of Epothilone B Hydroxylase and Ferredoxin Genes

Twenty µg of SC15847 genomic DNA was digested with BglII restriction
enzyme for 6 hours at 37°C. A 30k nanosep column (Gelman Sciences, Ann Arbor,
Michigan, USA) was used to concentrate the DNA and remove the enzyme and
buffer. The reactions were concentrated to 40 µl and washed with 200 µl of TE. The
digestion products were then separated on a 0.7% agarose gel and genomic DNA in the
range of 12-15 kb was excised from the gel and purified using the Qiagen gel
extraction method. The genomic DNA was then ligated to plasmid pWB19N (U.S.
Patent 5,516,679), which had been digested with BamHI and dephosphorylated using
the SAP I enzyme (Roche Molecular Biochemicals, Indianapolis, Indiana, catalog #1
758 250). Ligation reactions were performed in a 15 µl volume with 1U of T4 DNA
ligase (Invitrogen) for 1 hour at room temperature. One µl of the ligation was
transformed to 100 µl of chemically competent DH10B cells (Invitrogen) and 100 µl
plated to five LB agar plates with 30 µg/ml of neomycin, 37°C overnight.

Five nylon membrane circles (Roche Molecular Biochemicals, Indianapolis,
Indiana) were numbered and marked for orientation. The membranes were placed on
the plates 2 minutes and then allowed to dry for 5 minutes. The membranes were then
placed on Whatman filter disks saturated with 10% SDS for 5 minutes, 0.5N NaOH
with 1.5 M NaCl for 5 minutes, 1.5 M NaCl with 1.0 M Tris pH 8.0 for 5 minutes,
and 15 minutes on 2X SSC. The filters were hybridized as described previously for
the Southern hybridization. Hybridizing colonies were picked to 2 ml of TB with 30
µg/ml neomycin and grown overnight at 37°C. Plasmid DNA was isolated using a
miniprep column procedure (Mo Bio). This plasmid was named NPB29-1.

Example 5: DNA Sequencing and Analysis

The cloned PCR products were sequenced using fluorescent-dye-labeled
terminator cycle sequencing, Big-Dye sequencing kit (Applied Biosystems, Foster
City, California, USA) and were analyzed using laser-induced fluorescence capillary
electrophoresis, ABI Prism 310 sequencer (Applied Biosystems).

Example 6: Extraction of Total RNA

Total RNA was isolated from the SC15847 culture using a modification of the
Chomczynski and Sacchi method with a mono-phasic solution of phenol and
guanidine isothiocyanate, Trizol reagent (Invitrogen). Five ml of an SC15847 frozen
stock culture was thawed and used to inoculate 100 ml of F7 media in a 500 ml
Erlenmeyer flask. The culture was grown in a shaker incubator at 230 rpm, 30°C for
20 hours to an optical density at 600 nm (OD600) of 9.0. The culture was placed in a
16°C shaker incubator at 230 rpm for 20 minutes. Fifty-five milligrams of epothilone
B was dissolved in 1 ml of 100% ethanol and added to the culture. A second ml of
ethanol was used to rinse the residual epothilone B from the tube and added to the
culture. The culture was incubated at 16°C, 230 rpm for 30 hours. Thirty ml of the
culture was transferred to a 50 ml tube, 150 mg of lysozyme was added to the culture
and the culture was incubated 5 minutes at room temperature. Ten ml of the culture
was placed in a 50 ml Falcon tube and centrifuged 5 minutes, 4°C in a TJ6 centrifuge.
Two ml of chloroform was added and the tube was mixed vigorously for 15 seconds.
The tube was incubated 2 minutes at room temperature and centrifuged 10 minutes,
top speed in the TJ6 centrifuge. The aqueous layer was transferred to a fresh tube and
2.5 ml of isopropanol was added to precipitate the RNA. The tube was incubated 10
minutes at room temperature and centrifuged 10 minutes, 4°C. The supernatant was
removed, the pellet was rinsed with 70% ethanol and dried briefly under vacuum. The
pellet was resuspended in 150 µl of RNase-free dH2O. Fifty µl of 7.5 M LiCl was
added to the RNA and incubated at -20°C for 30 minutes. The RNA was pelleted by
centrifugation 10 minutes, 4°C in a microcentrifuge. The pellet was rinsed with 200 µl
of 70% ethanol, dried briefly under vacuum and resuspended in 150 µl of RNase-free
dH2O.

The RNA was treated with DNase I (Ambion, Austin, Texas, USA). Twenty-five
µl of total RNA (5.3 µg/µl), 2.5 µl of DNase I buffer and 1.0 µl of DNase I were combined
and incubated at 37°C for 25 minutes. Five µl of DNase I inactivation buffer was added, the
mixture was incubated 2 minutes and centrifuged 1 minute, and the supernatant was transferred to a fresh
tube.

Example 7: cDNA Synthesis

cDNA was synthesized from the total RNA using the Superscript II enzyme
(Invitrogen). The reaction was prepared with 1 µl of total RNA (5.3 µg/µl), 9 µl of
dH2O, 1 µl of dNTP mix (10 mM), and 1 µl of random hexamers. The reaction was
incubated at 65°C for 5 minutes then placed on ice. The following components were
then added: 4 µl of 1st strand buffer, 1 µl of RNase Inhibitor, 2.0 µl of 0.1 M DTT,
and 1 µl of Superscript II enzyme. The reaction was incubated at room temperature 10
minutes, 42°C for 50 minutes and 70°C for 15 minutes. One µl of RNaseH was added
and incubated 20 minutes at 37°C, 15 minutes at 70°C and stored at 4°C.

Example 8: DNA Labeling

The PCR conditions used to amplify the P450 specific products from genomic
DNA and cDNA were used to amplify the insert of plasmid pCRscript-29. Plasmid
pCRscript-29 contains a 340 bp PCR fragment amplified from SC15847 genomic
DNA using primers P450-1+ (SEQ ID NO:23) and P450-3- (SEQ ID NO:27). Two µl
of the plasmid prep was used as a template, with a total of 25 cycles. The amplified
product was gel purified using the Qiaquick gel extraction system (Qiagen). The
extracted DNA was ethanol precipitated and resuspended in 5 µl of TE; the yield was
estimated to be 500 ng. This fragment was labeled with digoxigenin using the chem
link labeling reagent (Roche Molecular Biochemicals, Indianapolis, Indiana, catalog
#1 836 463). Five µl of the PCR product was mixed with 0.5 µl of Dig-chem link and
dH2O added to 20 µl. The reaction was incubated 30 minutes at 85°C and 5 µl of stop
solution added. The probe concentration was estimated at 20 ng/µl.
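
The probe concentration estimate follows from the quantities given in this example, and the same arithmetic gives the probe volume needed for a hybridization solution. The following Python sketch is illustrative only; both helper names are hypothetical. The second calculation assumes the hybridization conditions stated in Example 9 (5 ml of hybridization buffer at approximately 20 ng/ml).

```python
def probe_concentration_ng_per_ul(dna_ng: float, reaction_ul: float, stop_ul: float) -> float:
    """Labeled probe concentration after the stop solution is added."""
    return dna_ng / (reaction_ul + stop_ul)


def probe_volume_ul(target_ng_per_ml: float, hyb_volume_ml: float, probe_ng_per_ul: float) -> float:
    """Volume of labeled probe needed to reach a target concentration in hybridization buffer."""
    return target_ng_per_ml * hyb_volume_ml / probe_ng_per_ul


# 500 ng of fragment in a 20 µl labeling reaction plus 5 µl of stop solution.
concentration = probe_concentration_ng_per_ul(500, 20, 5)
print(concentration, "ng/µl")

# Probe volume for 5 ml of hybridization buffer at 20 ng/ml (values from Example 9).
print(probe_volume_ul(20, 5, concentration), "µl of probe")
```

The first result matches the estimated probe concentration stated above, which suggests the estimate simply divides the input DNA by the final reaction volume.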

Example 9: Southern DNA Hybridization

Ten µl of genomic DNA (0.5 µg/µl) was digested with BamHI, BglII, EcoRI,
HindIII or NotI and separated at 12 volts for 16 hours. The gel was depurinated 10
minutes in 0.25 N HCl and transferred by vacuum to a nylon membrane (Roche
Molecular Biochemicals) in 0.4 N NaOH at 5" Hg for 90 minutes using a vacuum blotter
(Bio-Rad Laboratories, Inc., Hercules, California, USA, catalog #165-5000). The
membrane was rinsed in 1 M ammonium acetate and UV-crosslinked using the
Stratalinker UV Crosslinker (Stratagene). The membrane was rinsed in 2X SSC and
stored at room temperature.

The membrane was prehybridized 1 hour at 42°C in 20 ml of Dig Easy Hyb
buffer (Roche Molecular Biochemicals). The probe was denatured 10 minutes at 65°C
and then placed on ice. Five ml of probe in Dig-Easy Hyb at an approximate
concentration of 20 ng/ml was incubated with the membrane at 42°C overnight. The
membrane was washed 2 times in 2X SSC with 0.1% SDS at room temperature, then
2 times in 0.5X SSC with 0.1% SDS at 65°C. The membrane was equilibrated in
Genius buffer 1 (10 mM maleic acid, 15 mM NaCl; pH 7.5; 0.3% v/v Tween 20)
(Roche Molecular Biochemicals, Indianapolis, Indiana) for 2 minutes, then incubated
with 2% blocking solution (2% Blocking reagent in Genius Buffer 1) (Roche
Molecular Biochemicals, Indianapolis, Indiana) for 1 hour at room temperature. The
membrane was incubated with a 1:20,000 dilution of anti-dig antibody in 50 ml of
blocking solution for 30 minutes. The membrane was washed 2 times, 15 minutes
each, in 50 ml of Genius buffer 1. The membrane was equilibrated for two minutes in
Genius Buffer 3 (10 mM Tris-HCl, 10 mM NaCl; pH 9.5). One ml of a 1:100 dilution
of CSPD (disodium 3-(4-methoxyspiro{1,2-dioxetane-3,2'-(5'-chloro)tricyclo[3.3.1.1(3,7)]decan}-4-yl)phenyl
phosphate) (Roche Molecular
Biochemicals) in Genius buffer 3 was added to the membrane and incubated 5
minutes at room temperature, then placed at 37°C for 15 minutes. The membrane was
exposed to Biomax ML film (Kodak, Rochester, New York, USA) for 1 hour.

Example 10: E. coli Transformation

Competent cells were purchased from Invitrogen. E. coli strain DH10B was
used as a host for genomic cloning. The chemically competent cells were thawed on
ice and 100 µl aliquoted to a 17 x 100-mm polypropylene tube on ice. One µl of the
ligation mixture was added to the cells and incubated on ice for 30 minutes. The cells
were incubated at 42°C for 45 seconds, then placed on ice 1-2 minutes. 0.9 ml of
SOC medium (Invitrogen) was added and the cells were incubated one hour at 30-37°C
at 200-240 rpm. Cells were plated on a selective medium (Luria agar with
neomycin or ampicillin at a concentration of 30 µg/ml or 100 µg/ml, respectively).

Example 11: Transformation of Streptomyces lividans TK24

Plasmid pWB19N849 was constructed by digesting plasmid pWB19N with
HindIII and treating with SAP I and digesting plasmid pANT849 (Kieser, et al., 2000,
Practical Streptomyces Genetics, John Innes) with HindIII. The two linearized
fragments were ligated 1 hour at room temperature with 1U of T4 DNA ligase. One µl
of the ligation reaction was used to transform XL-1 Blue electrocompetent cells
(Stratagene). The recovered cells were plated to LB neomycin (30 µg/ml) overnight at
37°C. Colonies were picked to 2 ml of LB with 30 µg/ml neomycin and incubated
overnight at 30°C. MoBio plasmid minipreps were performed on all cultures.
Plasmids constructed from the ligation of pWB19N and pANT849 were determined
by electrophoretic mobility on 0.7% agarose. The plasmid pWB19N849 was digested
with HindIII and BglII to excise a 5.3 kb fragment equivalent to plasmid pANT849
digested with BglII and HindIII. This 5.3 kb fragment was purified on an agarose gel
and extracted using the Qiaquick gel extraction system.

A 1.469 kb DNA fragment containing the epothilone B hydroxylase gene and
the downstream ferredoxin gene was amplified using PCR. The 50 µl PCR reaction
was composed of 5 µl of Taq buffer, 2.5 µl glycerol, 1 µl of 20 ng/µl NPB29-1
plasmid, 0.4 µl of 25 mM dNTPs, 1.0 µl each of primers NPB29-6F (SEQ ID NO:28)
and NPB29-7R (SEQ ID NO:29) (5 pmole/µl), 38.1 µl of dH2O and 0.5 µl of Taq
enzyme (Stratagene). The reactions were performed on a Perkin Elmer 9700, 95°C for
5 minutes, then 30 cycles (96°C for 30 seconds, 60°C 30 seconds, 72°C for 2
minutes), and 72°C for 7 minutes. The PCR product was purified using a Qiagen
minielute column with the PCR cleanup procedure. The purified product was digested
with BglII and HindIII and purified on a 0.7% agarose gel. A 1.469 kb band was
excised from the gel and eluted using a Qiagen minielute column. Five µl of this PCR
product was ligated with 2 µl of the BglII, HindIII digested pANT849 vector in a 10
µl ligation reaction. The reaction was incubated at room temperature for 24 hours and
then transformed to S. lividans TK24 protoplasts.
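
The composition of the 50 µl PCR reaction above can be tabulated to check the volume accounting and derive final concentrations. The following Python sketch is illustrative only; the dictionary keys and the `final_conc` helper are hypothetical. It uses the stated 50 µl as the nominal reaction volume (the listed components add up to 49.5 µl) and treats 5 pmole/µl as 5 µM.

```python
# Component volumes in µl, as listed for the 1.469 kb amplification.
COMPONENTS_UL = {
    "Taq buffer": 5.0,
    "glycerol": 2.5,
    "NPB29-1 plasmid (20 ng/µl)": 1.0,
    "dNTPs (25 mM)": 0.4,
    "primer NPB29-6F (5 pmole/µl)": 1.0,
    "primer NPB29-7R (5 pmole/µl)": 1.0,
    "dH2O": 38.1,
    "Taq enzyme": 0.5,
}
NOMINAL_UL = 50.0  # reaction volume stated in the text

listed_total_ul = sum(COMPONENTS_UL.values())


def final_conc(stock: float, volume_ul: float, total_ul: float = NOMINAL_UL) -> float:
    """Final concentration in the reaction, in the same units as the stock."""
    return stock * volume_ul / total_ul


print("listed volumes sum to", round(listed_total_ul, 1), "µl")
print("dNTPs:", final_conc(25.0, 0.4), "mM")
print("each primer:", final_conc(5.0, 1.0), "µM")
print("template:", final_conc(20.0, 1.0), "ng/µl")
```

Keeping the recipe in one structure like this makes it easy to rescale the reaction or to spot a component whose volume does not reconcile with the stated total.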

Twenty ml of YEMB media was inoculated with a frozen spore suspension of
S. lividans TK24 and grown 48 hours in a 125 ml bi-indent flask. Protoplasts were
prepared as described in Practical Streptomyces Genetics. The ligation reaction was
mixed with protoplasts, then 500 µl of transformation buffer was added, followed
immediately by 5 ml of P buffer. The transformation reactions were spun down 7
minutes at 2,750 rpm, resuspended in 100 µl of P buffer and plated to one R2YE
plate. The plate was incubated at 28°C for 20 hours then overlaid with 5 ml of LB
0.7% agar with 250 µg/ml thiostrepton. After 7 days colonies were picked to an
R2YE grid plate with 50 µg/ml of thiostrepton. The colonies were grown an
additional 5 days at 28°C, then stored at 4°C.

This recombinant microorganism has been deposited with the ATCC and
designated PTA-4022.

Example 12: Transformation of Streptomyces rimosus

The procedure of Pigac and Schrempf, Appl. Environ. Microbiol., Vol. 61, No. 1,
352-356 (1995) was used to transform S. rimosus. S. rimosus strain R6 593 was
cultivated in 20 ml of CRM medium at 30°C on a rotary shaker (250 rpm). The cells
were harvested at 24 hrs by centrifugation for 5 minutes, 5,000 rpm, 4°C, and
resuspended in 20 ml of 10% sucrose, 4°C, and centrifuged for 5 minutes, 5,000 rpm,
4°C. The pellet was resuspended in 10 ml of 15% glycerol, 4°C and centrifuged for 5
minutes, 5,000 rpm, 4°C. The pellet was resuspended in 2 ml of 15% glycerol, 4°C
with 100 µg/ml lysozyme and incubated at 37°C for 30 minutes, centrifuged for 5
minutes, 5,000 rpm, 4°C and resuspended in 2 ml of 15% glycerol, 4°C. The 15%
glycerol wash was repeated once and the pellet was resuspended in 1 to 2 ml of
Electroporation Buffer. The cells were stored at -80°C in 50-200 µl aliquots.

The ligations were prepared as described for the S. lividans transformation.
After the incubation of the ligation reaction, the volume was brought to 100 µl with
dH2O, NaCl was added to 0.3 M, and the reaction extracted with an equal volume of
24:1:1 phenol:chloroform:isoamyl alcohol. Twenty µg of glycogen was added and the
ligated DNA was precipitated with 2 volumes of 100% ethanol at -20°C for 30
minutes. The DNA was pelleted 10 minutes in a microcentrifuge, washed once with
70% ethanol, dried 5 minutes in a speed-vac concentrator and resuspended in 5 µl of
dH2O.

One frozen aliquot of cells was thawed at room temperature and divided, 50
µl/tube for each DNA sample for electroporation. The cells were stored on ice until
use. DNA in 1 to 2 µl of dH2O was added and mixed. The cell and DNA mixture was
transferred to a 2 mm gapped electrocuvette (Bio-Rad Laboratories, Richmond,
California, USA) that was pre-chilled on ice. The cells were electroporated at a setting
of 2 kV (10 kV/cm), 25 µF, 400 Ω using a Gene Pulser™ (Bio-Rad Laboratories). The
cells were diluted with 0.75 to 1.0 ml of CRM (0-4°C), transferred to 15 ml culture
tubes and incubated with agitation 3 hrs at 30°C. The cells were plated on trypticase
soy broth agar plates with 10-30 µg/ml of thiostrepton.
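
The pulse settings quoted above are related by simple arithmetic, which the following sketch reproduces. It is a minimal illustration, assuming the field strength is the set voltage divided by the cuvette gap and that the nominal time constant is the product of the parallel resistance and the capacitance; the function names are hypothetical, and the time constant actually delivered also depends on the conductivity of the sample.

```python
def field_strength_kv_per_cm(voltage_kv: float, gap_mm: float) -> float:
    """Nominal field strength: set voltage divided by the electrode gap."""
    return voltage_kv * 10.0 / gap_mm  # 10 mm per cm


def nominal_time_constant_ms(resistance_ohm: float, capacitance_uf: float) -> float:
    """Nominal RC time constant; ohm x microfarad gives microseconds."""
    return resistance_ohm * capacitance_uf / 1000.0


print(field_strength_kv_per_cm(2.0, 2.0))    # 10.0 kV/cm, matching the text
print(nominal_time_constant_ms(400.0, 25.0))  # 10.0 ms
```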

Example 13: High performance liquid chromatography

The liquid chromatography separation was performed using a Waters 2690
Separation Module system (Waters Corp., Milford, MA, USA) and a column, 4.6 x
150 mm, filled with SymmetryShield RP8, particle size 3.5 µm (Waters Corp.,
Milford, MA, USA). Gradient mobile phase programming was used with a flow
rate of 1.0 ml/minute. Eluent A was water/acetonitrile (20:1) + 10 mM ammonium
acetate. Eluent B was acetonitrile/water (20:1). The mobile phase was a linear
gradient from 12% B to 28% B over 6 minutes and held isocratic at 28% B over 4
minutes. This was followed by a 28% B to 100% B linear gradient over 20 minutes
and a linear gradient to 12% B over two minutes with a 3 minute hold at 12% B.
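
The gradient described above can be expressed as a table of breakpoints and interpolated to give the programmed eluent composition at any time. This is a sketch under stated assumptions: time zero is taken as the start of the first ramp, the segments are assumed to run back to back, and `GRADIENT` and `percent_b` are hypothetical names.

```python
# (time in minutes, % eluent B) breakpoints derived from the text:
# 12->28% over 6 min, hold 4 min, 28->100% over 20 min,
# 100->12% over 2 min, hold 3 min.
GRADIENT = [(0, 12), (6, 28), (10, 28), (30, 100), (32, 12), (35, 12)]


def percent_b(t_min: float, program=GRADIENT) -> float:
    """Programmed %B at time t_min by linear interpolation between breakpoints."""
    if t_min < program[0][0] or t_min > program[-1][0]:
        raise ValueError("time is outside the programmed gradient")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable for a well-formed program")


print(percent_b(3))   # 20.0, halfway up the first ramp
print(percent_b(20))  # 64.0, halfway up the second ramp
```

The total programmed run length under these assumptions is 35 minutes per injection.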

Example 14: Mass spectrometry

The column effluent was introduced directly into the electrospray ion source of a
ZMD mass spectrometer (Micromass, Manchester, UK). The instrument was calibrated
using Test Juice reference standard (Waters Corp., Milford, MA, USA), which was delivered
at a flow of 10 µl/minute from a syringe pump (Harvard Apparatus, Holliston, MA,
USA). The mass spectrometer was operated at a low mass resolution of 13.2 and a high
mass resolution of 11.2. Spectra were acquired using a scan range of m/z 100 to
600 at an acquisition rate of 10 spectra/second. The ionization technique employed was
positive electrospray (ES). The sprayer voltage was kept at 2900 V and the cone voltage
of the ion source was kept at a potential of 17 V.

Example 15: Use of the ebh gene sequence (SEQ ID NO:1) to isolate cytochrome
P450 genes from other microorganisms

Genomic DNA was isolated from a set of cultures (ATCC43491,
ATCC14930, ATCC53630, ATCC53550, ATCC39444, ATCC43333, ATCC35165)
using the DNAzol reagent. The DNA was used as a template for PCR reactions using
primers designed to the sequence of the ebh gene. Three sets of primers were used for
amplification: NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29), NPB29-16f
(SEQ ID NO:50) and NPB29-17r (SEQ ID NO:51), and NPB29-19f (SEQ ID
NO:52) and NPB29-20r (SEQ ID NO:53).

PCR reactions were prepared in a volume of 20 µl, containing 200-500 ng of
genomic DNA and a forward and reverse primer. All primers were added to a final
concentration of 1.4-2.0 µM. The PCR reaction was prepared with 0.2 µl of
Advantage™ 2 Taq enzyme (BD Biosciences Clontech, Palo Alto, California, USA),
2 µl of Advantage™ 2 Taq buffer and 0.2 µl of 2.5 mM of dNTPs with dH2O to 20
µl. The cycling reactions were performed on a Geneamp® 9700 PCR system or a
Mastercycler® gradient (Eppendorf, Westbury, New York, USA) with the following
protocol: 95°C for 5 minutes, 35 cycles (96°C for 20 seconds, 54-69°C for 30 seconds,
72°C for 2 minutes), 72°C for 7 minutes. The expected size of the PCR products is
approximately 1469 bp for the NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID
NO:29) primer pair, 1034 bp for the NPB29-16f (SEQ ID NO:50) and NPB29-17r
(SEQ ID NO:51) primer pair and 1318 bp for the NPB29-19f (SEQ ID NO:52) and
NPB29-20r (SEQ ID NO:53) primer pair. The PCR reactions were analyzed on 0.7%
agarose gels. PCR products of the expected size were excised from the gel and
purified using the Qiagen gel extraction method. The purified products were
sequenced using the Big-Dye sequencing kit and analyzed using an ABI 310
sequencer.

Example 16: Construction of plasmid pPCRscript-ebh

A 1.469 kb DNA fragment containing the epothilone B hydroxylase gene and
the downstream ferredoxin gene was amplified using PCR. The 50 µl PCR reaction
was composed of 5 µl of Taq buffer, 2.5 µl glycerol, 1 µl of 20 ng/µl NPB29-1
plasmid, 0.4 µl of 25 mM dNTPs, 1.0 µl each of primers NPB29-6f (SEQ ID NO:28)
and NPB29-7r (SEQ ID NO:29) (5 pmole/µl), 38.1 µl of dH2O and 0.5 µl of Taq
enzyme (Stratagene). The reactions were performed on a Geneamp® 9700 PCR
system, with the following conditions: 95°C for 5 minutes, then 30 cycles (96°C for
30 seconds, 60°C for 30 seconds, 72°C for 2 minutes), and 72°C for 7 minutes. The PCR
product was purified using a Qiagen Qiaquick column with the PCR cleanup
procedure. The purified product was digested with BglII and HindIII and purified on a
0.7% agarose gel. A 1.469 kb band was excised from the gel and eluted using a
Qiagen Qiaquick gel extraction procedure. The fragments were then cloned into the
pPCRscript Amp vector using the PCRscript Amp cloning kit. Colonies containing
inserts were picked to 1-2 ml of LB (Luria Broth) with 100 µg/ml ampicillin and
grown at 30-37°C for 16-24 hours at 230-300 rpm. Plasmid isolation was performed
using the Mo Bio miniplasmid prep kit. The sequence of the insert was confirmed by
cycle sequencing with the Big-Dye sequencing kit. This plasmid was named
pPCRscript-ebh.

Example 17: Mutagenesis of the ebh gene for improved yield or altered
specificity

The Quikchange® XL Site-Directed Mutagenesis Kit and the Quikchange®
Multi Site-Directed Mutagenesis kit, both from Stratagene, were used to introduce
mutations in the coding region of the ebh gene. Both of these methods employ DNA
primers 35-45 bases in length containing the desired mutation (SEQ ID NO:54-59 and
71), a methylated circular plasmid template and PfuTurbo® DNA Polymerase (U.S.
Patent Nos. 5,545,552, 5,866,395 and 5,948,663) to generate copies of the plasmid
template incorporating the mutation carried on the mutagenic primers. Subsequent
digestion of the reaction with the restriction endonuclease DpnI selectively
digests the methylated plasmid template, but leaves the non-methylated mutated
plasmid intact. The manufacturer's instructions were followed for all procedures with
the exception of the DpnI digestion step, in which the incubation time was increased
from 1 hr to 3 hrs. The pPCRscript-ebh vector was used as the template for
mutagenesis.

One to two µl of the reaction was transformed to either XL1-Blue®
electrocompetent or XL10-Gold® ultracompetent cells (Stratagene). Cells were plated
to a density of greater than 100 colonies per plate on LA (Luria Agar) 100 µg/ml
ampicillin plates, and incubated 24-48 hrs at 30-37°C. The entire plate was
resuspended in 5 ml of LB containing 100 µg/ml ampicillin. Plasmid was isolated
directly from the resuspended cells by centrifuging the cells and then purifying the
plasmid using the Mo Bio miniprep procedure. This plasmid was then used as a
template for PCR with primers NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID
NO:29) to amplify a mutated expression cassette. Digestion of the 1.469 kb PCR
product with the restriction enzymes BglII and HindIII was used to prepare this
fragment for ligation to vector pANT849 also digested with BglII and HindIII.

Alternatively, the resuspended cells were used to inoculate 20-50 ml of LB
containing 100 µg/ml ampicillin and grown 18-24 hrs at 30-37°C. Qiagen midi-preps
were performed on the cultures to isolate plasmid DNA containing the desired
mutation. Digestion with the restriction enzymes BglII and HindIII was used to excise
the mutated expression cassette for ligation to BglII and HindIII digested plasmid
pANT849. Screening of mutants was performed in S. lividans or S. rimosus as
described.

Alternatively, the method of Leung et al., Technique, A Journal of Methods in
Cell and Molecular Biology, Vol. 1, No. 1, 11-15 (1989) was used to generate random
mutation libraries of the ebh gene. Manganese and/or reduced dATP concentration is
used to control the mutagenesis frequency of the Taq polymerase. The plasmid
pPCRscript-ebh was digested with NotI to linearize the plasmid. The polymerase buffer
was prepared with 0.166 M (NH4)2SO4, 0.67 M Tris-HCl pH 8.8, 61 mM MgCl2, 67
µM EDTA pH 8.0 and 1.7 mg/ml bovine serum albumin. The PCR reaction was
prepared with 10 µl of NotI digested pPCRscript-ebh (0.1 ng/µl), 10 µl of polymerase
buffer, 1.0 µl of 1 M β-mercaptoethanol, 10.0 µl of DMSO, 1.0 µl of NPB29-6f (SEQ
ID NO:28) primer (100 pmole/µl), 1.0 µl of NPB29-7r (SEQ ID NO:29) primer (100
pmole/µl), 10 µl of 5 mM MnCl2, 10.0 µl 10 mM dGTP, 10.0 µl 2 mM dATP, 10 mM
dTTP, 10.0 µl 10 mM dCTP, and 2.0 µl Taq polymerase. dH2O was added to 100 µl.
Reactions were also prepared as described above but without MnCl2. The cycling
reactions were performed on a GeneAmp® PCR system with the following protocol:
95°C for 1 minute, 25-30 cycles (94°C for 1 minute, 55°C for 30 seconds, 72°C for 4
minutes), 72°C for 7 minutes. The PCR reactions were separated on an agarose gel
using a Qiagen spin column. The fragments were then digested with BglII and HindIII
and purified using a Qiagen spin column. The purified fragments were then ligated to
BglII and HindIII digested pANT849 plasmids. Screening of mutants was performed
in S. lividans and S. rimosus.
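
The biased nucleotide pool that drives misincorporation in this reaction can be made explicit by computing final concentrations from the stock concentrations and volumes. The sketch below reads the mix above as 10 µl each of 5 mM MnCl2, 10 mM dGTP, 2 mM dATP and 10 mM dCTP in a 100 µl reaction; dTTP is left out because its volume is not stated legibly in the text. The helper name `final_conc_mm` is hypothetical.

```python
def final_conc_mm(stock_mm: float, volume_ul: float, total_ul: float = 100.0) -> float:
    """Dilution arithmetic: C_final = C_stock * V_added / V_total."""
    return stock_mm * volume_ul / total_ul


# (stock concentration in mM, volume added in microlitres), as read from the text
components = {
    "MnCl2": (5.0, 10.0),
    "dGTP": (10.0, 10.0),
    "dATP": (2.0, 10.0),
    "dCTP": (10.0, 10.0),
}

final = {name: final_conc_mm(c, v) for name, (c, v) in components.items()}
for name, conc in final.items():
    print(f"{name}: {conc:.2f} mM")

# dATP is present at one fifth the concentration of dGTP and dCTP.
print(final["dATP"] / final["dGTP"])
```

Under this reading, dATP ends up at 0.2 mM against 1 mM for dGTP and dCTP, with MnCl2 at 0.5 mM.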

Table of Characterized Mutants

Mutant         Position   Substitution    Wild-type
ebh24-16       92         Valine          Isoleucine
               237        Alanine         Phenylalanine
ebh25-1        195        Serine          Asparagine
               294        Proline         Serine
ebh10-53       190        Tyrosine        Phenylalanine
               231        Arginine        Glutamic acid
ebh24-16d8     92         Valine          Isoleucine
               237        Alanine         Phenylalanine
               67         Glutamine       Arginine
ebh24-16c11    92         Valine          Isoleucine
               93         Glycine         Alanine
               237        Alanine         Phenylalanine
               365        Threonine       Isoleucine
ebh24-16-16    92         Valine          Isoleucine
               106        Alanine         Valine
               237        Alanine         Phenylalanine
ebh24-16-74    88         Histidine       Arginine
               92         Valine          Isoleucine
               237        Alanine         Phenylalanine
ebh-M18        31         Lysine          Glutamic acid
               176        Valine          Methionine
ebh24-16g8     92         Valine          Isoleucine
               237        Alanine         Phenylalanine
               67         Glutamine       Arginine
               130        Threonine       Isoleucine
               176        Alanine         Methionine
ebh24-16b9     92         Valine          Isoleucine
               237        Alanine         Phenylalanine
               67         Glutamine       Arginine
               140        Threonine       Alanine
               176        Serine          Methionine
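
For downstream analysis it can be convenient to hold entries of the table above as structured data and render them in the usual wild-type/position/substitution shorthand. The sketch below is illustrative only: it encodes three of the mutants as read from the table, and the names `MUTANTS`, `ONE_LETTER` and `shorthand` are hypothetical.

```python
# Three-letter to one-letter codes for the residues used below.
ONE_LETTER = {
    "Ala": "A", "Arg": "R", "Glu": "E", "Ile": "I", "Lys": "K",
    "Met": "M", "Phe": "F", "Tyr": "Y", "Val": "V",
}

# mutant name -> list of (position, wild-type residue, substituted residue)
MUTANTS = {
    "ebh24-16": [(92, "Ile", "Val"), (237, "Phe", "Ala")],
    "ebh10-53": [(190, "Phe", "Tyr"), (231, "Glu", "Arg")],
    "ebh-M18": [(31, "Glu", "Lys"), (176, "Met", "Val")],
}


def shorthand(name: str) -> list[str]:
    """Return substitutions in wild-type/position/substitution form, e.g. I92V."""
    return [f"{ONE_LETTER[wt]}{pos}{ONE_LETTER[sub]}" for pos, wt, sub in MUTANTS[name]]


print(shorthand("ebh24-16"))  # ['I92V', 'F237A']
```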



Example 18: Comparison of epothilone B transformation in cells expressing ebh
and mutants thereof

In these experiments, twenty ml of YEME medium in a 125 ml bi-indented
flask was inoculated with 200 µl of a frozen spore preparation of S. lividans TK24, S.
lividans (pANT849-ebh), S. lividans (pANT849-ebh10-53) or S. lividans (pANT849-
ebh24-16) and incubated 48 hours at 230 rpm, 30°C. Thiostrepton, 10 µg/ml, was
added to media inoculated with S. lividans (pANT849-ebh), S. lividans (pANT849-
ebh10-53) and S. lividans (pANT849-ebh24-16). Four ml of culture was transferred to
20 ml of R5 medium in a 125 ml Erlenmeyer flask and incubated 18 hrs at 230 rpm,
30°C. Epothilone B in 100% EtOH was added to each culture to a final concentration
of 0.05% weight/volume. Samples were taken at 0, 24, 48 and 72 hours with the
exception of the S. lividans (pANT849-ebh24-16) culture, in which the epothilone B
had been completely converted to epothilone F at 48 hours. The samples were
analyzed by HPLC. Results were calculated as a percentage of the epothilone B at
time 0 hours.
Epothilone B:

Time (hours)   TK24    pANT849-ebh   pANT849-ebh10-53   pANT849-ebh24-16
0              100%    100%          100%               100%
24             99%     78%           69%                56%
48             87%     19%           39%                0%
72             87%     0%            3%                 -

Epothilone F:

Time (hours)   TK24    pANT849-ebh   pANT849-ebh10-53   pANT849-ebh24-16
0              0%      0%            0%                 0%
24             0%      4%            9%                 23%
48             0%      21%           29%                52%
72             0%      14%           41%                -
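
One way to read the two tables together is as a mass balance: the epothilone B remaining plus the epothilone F formed, subtracted from 100%, gives the share of starting material not accounted for by either compound. The sketch below assumes both tables are expressed relative to the epothilone B present at time 0, as the text states for the results; the names `EPO_B`, `EPO_F` and `unaccounted_percent` are hypothetical.

```python
TIMES_H = [0, 24, 48, 72]

# Percent of time-0 epothilone B, as read from the tables (None = not sampled).
EPO_B = {
    "TK24": [100, 99, 87, 87],
    "pANT849-ebh": [100, 78, 19, 0],
    "pANT849-ebh10-53": [100, 69, 39, 3],
    "pANT849-ebh24-16": [100, 56, 0, None],
}
EPO_F = {
    "TK24": [0, 0, 0, 0],
    "pANT849-ebh": [0, 4, 21, 14],
    "pANT849-ebh10-53": [0, 9, 29, 41],
    "pANT849-ebh24-16": [0, 23, 52, None],
}


def unaccounted_percent(strain: str, time_h: int):
    """100 - (B remaining + F formed); None where no sample was taken."""
    i = TIMES_H.index(time_h)
    b, f = EPO_B[strain][i], EPO_F[strain][i]
    if b is None or f is None:
        return None
    return 100 - b - f


print(unaccounted_percent("pANT849-ebh24-16", 48))  # 48
print(unaccounted_percent("pANT849-ebh", 72))       # 86
```

The difference says only that the material is not present as epothilone B or F at that time point; it does not identify where it went.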



Alternatively, the bioconversion of epothilone B to epothilone F was
performed in S. rimosus host cells transformed with expression plasmids containing
the ebh gene and its variants or mutants. One hundred µl of a frozen S. rimosus
transformant culture was inoculated to 20 ml CRM media with 10 µg/ml thiostrepton
and cultivated 16-24 hr, 30°C, 230-300 rpm. Epothilone B in 100% ethanol was
added to each culture to a final concentration of 0.05% weight/volume. The reaction
was typically incubated 20-40 hrs at 30°C, 230-300 rpm. The concentration of
epothilones B and F was determined by HPLC analysis.

Evaluation of mutants in S. rimosus

Mutant         Epothilone F yield
ebh-M18        55%
ebh24-16d8     75%
ebh24-16c11    75%
ebh24-16-16    75%
ebh24-16-74    75%
ebh24-16b9     80%
ebh24-16g8     85%



Example 19: Biotransformation of compactin to pravastatin

Twenty ml of R2YE media with 10 µg/ml thiostrepton in a 125 ml flask was
inoculated with 200 µl of a frozen spore preparation of S. lividans (pANT849) or S.
lividans (pANT849-ebh) and incubated 72 hours at 230 rpm, 28°C. Four ml of culture
was inoculated to 20 ml of R2YE media and grown 24 hours at 230 rpm, 28°C. One
ml of culture was transferred to a 15 ml polypropylene culture tube, 10 µl of
compactin (40 mg/ml) was added to each culture and incubated for 24 hours, 28°C,
250 rpm. Five hundred µl of the culture broth was transferred to a fresh 15 ml
polypropylene culture tube. Five hundred µl of 50 mM sodium hydroxide was added
and vortexed. Three ml of methanol was added and vortexed, and the tube was
centrifuged 10 minutes at 3000 rpm in a TJ-6 table-top centrifuge. The organic phase
was analyzed by HPLC. Compactin and pravastatin values were assessed relative to
the control S. lividans (pANT849) culture.



Compactin and Pravastatin as a Percentage of Starting Compactin
Concentration:

              S. lividans (pANT849)   S. lividans (pANT849-ebh)
Compactin     36%                     11%
Pravastatin   11%                     53%



Example 20: High performance liquid chromatography method for compactin and
pravastatin detection

The liquid chromatography separation was performed using a Hewlett
Packard 1090 Series Separation system (Agilent Technologies, Palo Alto, California,
USA) and a column, 50 x 4.6 mm, filled with Spherisorb ODS2, particle size 5 µm
(Keystone Scientific, Inc., Bellefonte, Pennsylvania, USA). Gradient mobile
phase programming was used with a flow rate of 2.0 ml/minute. Eluent A was water,
10 mM ammonium acetate and 0.05% phosphoric acid. Eluent B was acetonitrile.
The mobile phase was a linear gradient from 20% B to 90% B over 4 minutes.

Example 21: Structure determination of the biotransformation product of
mutant ebh25-1

Analytical HPLC was performed using a Hewlett Packard 1100 Series liquid
chromatograph with a YMC Packed ODS-AQ column, 4.6 mm i.d. x 15 cm l. A
gradient system of water (solvent A) and acetonitrile (solvent B) was used: 20% to
90% B linear gradient, 10 minutes; 90% to 20% B linear gradient, 2 minutes. The flow
rate was 1 ml/minute and UV detection was at 254 nm.

Preparative HPLC was performed using the following equipment and
conditions:

Pump: Varian ProStar Solvent Delivery Module (Varian Inc., Palo Alto, California,
USA). Detector: Gynkotek UVD340S.
Column: YMC ODS-A column (30 mm ID x 100 mm length, 5 µm particle size).
Elution flow rate: 30 ml/minute.
Elution gradient: (solvent A: water; solvent B: acetonitrile), 20% B, 2 minutes; 20%
to 60% B linear gradient, 18 minutes; 60% B, 2 minutes; 60% to 90% B linear
gradient, 1 minute; 90% B, 3 minutes; 90% to 20% B linear gradient, 2 minutes.
Detection: UV, 210 nm.

LC/NMR was performed as follows: 40 µl of sample was injected onto a
YMC Packed ODS-AQ column (4.6 mm i.d. x 15 cm l). The column was eluted at 1
ml/minute flow rate with a gradient system of D2O (solvent A) and acetonitrile-d3
(solvent B): 30% B, 1 minute; 30% to 80% B linear gradient, 11 minutes. The eluent
passed a UV detection cell (monitored at 254 nm) before flowing through a F19/H1
NMR probe (60 µl active volume) in a Varian AS-600 NMR spectrometer. The
biotransformation product was eluted at around 7.5 minutes and the flow was stopped
manually to allow the eluent to remain in the NMR probe for NMR data acquisition.

Isolation and analysis were performed as follows. The butanol/methanol extract
(about 10 ml) was evaporated to dryness under a nitrogen stream. One ml of methanol
was added to the residue (38 mg) and insoluble material was removed by
centrifugation (13000 rpm, 2 min). 0.1 ml of the supernatant was used for the LC/NMR
study and the remaining 0.9 ml was subjected to preparative HPLC (0.2-0.4 ml per
injection). Two major peaks were observed and collected: peak A was eluted between
14 and 15 minutes, while peak B was eluted between 16.5 and 17.5 minutes.
Analytical HPLC analysis indicated that peak B was the parent compound, epothilone
B (Rt 8.5 minutes), and peak A was the biotransformation product (Rt 7.3 minutes).
The peak A fractions were pooled and MS analysis data was obtained with the pooled
fractions. The pooled fraction was evaporated to a small volume, then was
lyophilized to give 3 mg of white solid. NMR and HPLC analysis of the white solid
(dissolved in methanol) revealed that the biotransformation product was partially
decomposed during the drying process.

APPENDIX 1



Atom No.   Residue   Atom Name   X-coord       Y-coord       Z-coord
1          ALA9      N           -39.918       -4.913        -1.651
2          ALA9      CA          -38.454       -5.033        -1.537
3          ALA9      C           -37.953       -4.886        -0.099
4          ALA9      O           -38.625       -4.31         0.765
5          ALA9      CB          -37.809       -3.967        -2.415
6          THR10     N           -36.781       -5.447        0.146
7          THR10     CA          -36.187       -5.437        1.49
8          THR10     C           -34.916       -4.585        1.553
9          THR10     O           -34.016       -4.735        0.72
10         THR10     CB          -35.871       -6.887        1.846
11         THR10     OG1         -37.075       -7.631        1.717
12         THR10     CG2         -35.355       -7.053        3.271
13         LEU11     N           -34.858       -3.699        2.536
14         LEU11     CA          -33.669       -2.853        2.745
15         LEU11     C           -32.511       -3.649        3.353
16         LEU11     O           -32.706       -4.468        4.259
17         LEU11     CB          -34.033       -1.707        3.687
18         LEU11     CG          -35.079       -0.78         3.078
19         LEU11     CD1         -35.53        0.265         4.091
20         LEU11     CD2         -34.555       -0.111        1.81
21         PRO12     N           -31.32        -3.422        2.823
22         PRO12     CA          -30.121       -4.119        3.302
23         PRO12     C           -29.652       -3.606        4.663
24         PRO12     O           -29.656       -2.397        4.918
25         PRO12     CB          -29.081       -3.842        2.259
26         PRO12     CG          -29.597       -2.771        1.309
27         PRO12     CD          -31.031       -2.493        1.729
28         LEU13     N           -29.278       -4.522        5.54
29         LEU13     CA          -28.676       -4.118        6.819
30         LEU13     C           -27.183       -3.88         6.627
31         LEU13     O           -26.449       -4.806        6.267
32         LEU13     CB          -28.898       -5.196        7.872
33         LEU13     CG          -30.374       -5.354        8.217
34         LEU13     CD1         -30.587       -6.516        9.181
35         LEU13     CD2         -30.945       -4.067        8.802
36         ALA14     N           -26.72        -2.741        7.112
37         ALA14     CA          -25.355       -2.266        6.825
38         ALA14     C           -24.244       -2.941        7.634
39         ALA14     O           -23.058       -2.719        7.372
40         ALA14     CB          -25.311       -0.764        7.075
41         ARG15     N           -24.628       -3.792        8.569
42         ARG15     CA          -23.664       -4.537        9.379
43         ARG15     C           -23.476       -5.983        8.91
44         ARG15     O           -22.815       -6.767        9.599
45         ARG15     CB          -24.174       -4.519        10.81
46         ARG15     CG          -25.655       -4.879        10.84
47         ARG15     CD          -26.2         -4.843        12.26
48         ARG15     NE          -27.657       -5.039        12.256
49         ARG15     CZ          -28.358       -5.301        13.36
50         ARG15     NH1         -29.69        -5.376        13.3
51         ARG15     NH2         -27.735       -5.412        14.536
52         LYS16     N           -24.096       -6.351        7.798
53         LYS16     CA          -24.016       -7.741        [illegible]
54         LYS16     C           -22.639       -8.128        [illegible]
55         LYS16     O           -21.959       -7.359        [illegible]
56         LYS16     CB          -25.061       -7.977        [illegible]
57         LYS16     CG          -26.466       -7.985        [illegible]
58         LYS16     CD          -26.605       -9.079        [illegible]
59         LYS16     CE          -28.002       -9.092        [illegible]
60         LYS16     NZ          [illegible]   -10.128       [illegible]


61-84      CYS17, PRO18, PHE19   [coordinates illegible in source]

85         SER20     N           [illegible]   [illegible]   [illegible]
86         SER20     CA          -17.102       -6.717        [illegible]
87         SER20     C           -16.569       -7.839        4.342
88         SER20     O           -16.632       -7.723        5.573
89         SER20     CB          -16.557       -5.371        3.929
90         SER20     OG          -17.236       -5.019        5.129
91         PRO21     N           -15.974       -8.867        3.753
92         PRO21     CA          -15.978       -9.134        2.304
93         PRO21     C           -17.267       -9.836        1.856
94         PRO21     O           -18.026       -10.327       2.702
95         PRO21     CB          -14.8         -10.047       2.111
96         PRO21     CG          -14.442       -10.669       3.455
97         PRO21     CD          -15.306       -9.949        4.481
98         PRO22     N           -17.551       -9.859        0.561
99         PRO22     CA          -16.897       -9.007        -0.445
100        PRO22     C           -17.4         -7.575        -0.296
101        PRO22     O           -18.341       -7.371        0.469
102        PRO22     CB          -17.32        -9.591        -1.762
103        PRO22     CG          -18.478       -10.549       -1.528
104        PRO22     CD          -18.669       -10.604       -0.021
105        PRO23     N           -16.687       -6.605        -0.842
106        PRO23     CA          -17.224       -5.241        -0.897
107        PRO23     C           [illegible]   [illegible]   [illegible]
108        PRO23     O           [illegible]   [illegible]   [illegible]
109        PRO23     CB          [illegible]   [illegible]   [illegible]
110        PRO23     CG          [illegible]   [illegible]   -1.95
111        PRO23     CD          [illegible]   [illegible]   -1.509
112        GLU24     N           [illegible]   [illegible]   [illegible]
113        GLU24     CA          [illegible]   -5.192        [illegible]
114        GLU24     C           -21.415       [illegible]   -2.088
115        GLU24     O           -22.323       -3.794        -2.93
116        GLU24     CB          -21.934       -5.68         -0.48


117        GLU24     CG          -23.27        -6.137        -1.052
118        GLU24     CD          -23.982       -7.017        -0.024
119        GLU24     OE1         -24.613       -7.981        -0.433
120        GLU24     OE2         -23.833       -6.745        1.158
121        TYR25     N           -20.573       -2.843        -1.878
122        TYR25     CA          -20.842       -1.47         -2.303
123        TYR25     C           -20.704       -1.311        -3.816
124        TYR25     O           -21.364       -0.436        -4.385
125        TYR25     CB          -19.828       -0.568        -1.608
126        TYR25     CG          -19.616       -0.882        -0.128
127        TYR25     CD1         -20.662       -0.753        0.779
128        TYR25     CD2         -18.364       -1.298        0.311
129        TYR25     CE1         -20.461       -1.062        2.119
130        TYR25     CE2         -18.163       -1.605        1.65
131        TYR25     CZ          -19.213       -1.492        2.55
132        TYR25     OH          -19.026       -1.859        3.866
133        GLU26     N           -20.1         [illegible]   -4.468
134        GLU26     CA          -20.009       -2.293        -5.928
135        GLU26     C           -21.404       -2.483        -6.52
136        GLU26     O           -21.92        -1.572        -7.177
137        GLU26     CB          -19.129       -3.454        -6.39
138        GLU26     CG          -17.813       -3.593        -5.628
139        GLU26     CD          -16.94        [illegible]   [illegible]
140-147    GLU26, ARG27          [coordinates illegible in source]
148        ARG27     CD          [illegible]   [illegible]   -6.55
149        ARG27     NE          [illegible]   -7.926        -5.108
150        ARG27     CZ          [illegible]   [illegible]   [illegible]
151        ARG27     NH1         [illegible]   [illegible]   [illegible]
152        ARG27     NH2         -22.428       -8.879        -3.126
153        LEU28     N           -24.197       -2.331        -4.771
154        LEU28     CA          -25.11        -1.358        -4.168
155        LEU28     C           -25.131       -0.079        -4.987
156        LEU28     O           -26.214       0.286         -5.45
157        LEU28     CB

What is Claimed is:

1. An isolated nucleic acid sequence encoding epothilone B hydroxylase
or a mutant or variant thereof.

2. The isolated nucleic acid sequence of claim 1 comprising SEQ ID NO:
1, 30, 32, 34, 36, 37, 38, 39, 40, 41, 42, 60, 62, 64, 66, 68, 72 or 74.

3. The isolated nucleic acid sequence of claim 1 comprising SEQ ID
NO:1.

4. The isolated nucleic acid sequence of claim 1 encoding a mutant with
at least one amino acid substitution in an active site of the epothilone B hydroxylase
enzyme.

5. The isolated nucleic acid sequence of claim 1 encoding a mutant with
at least one amino acid substitution at amino acid GLU31, ARG67, ARG88, ILE92,
ALA93, VAL106, ILE130, ALA140, MET176, PHE190, GLU231, SER294,
PHE237, or ILE365 of SEQ ID NO:2.

6. The isolated nucleic acid sequence of claim 1 encoding a mutant with
at least one amino acid substitution at amino acid LEU39, GLN43, ALA45, MET57,
LEU58, HIS62, PHE63, SER64, SER65, ASP66, ARG67, GLN68, SER69, LEU74,
MET75, VAL76, ALA77, ARG78, GLN79, ILE80, ASP84, LYS85, PRO86, PHE87,
ARG88, PRO89, SER90, LEU91, ILE92, ALA93, MET94, ASP95, HIS99, ARG103,
PHE110, ILE155, PHE169, GLN170, CYS172, SER173, SER174, ARG175,
MET176, LEU177, SER178, ARG179, ARG186, PHE190, LEU193, VAL233,
GLY234, LEU235, ALA236, PHE237, LEU238, LEU239, LEU240, ILE241,
ALA242, GLY243, HIS244, GLU245, THR246, THR247, ALA248, ASN249,
MET250, LEU283, THR287, HIS288, ALA289, GLU290, THR291, ALA292,
THR293, SER294, ARG295, PHE296, ALA297, THR298, GLU312, GLY313,
VAL314, VAL315, GLY316, VAL344, ALA345, PHE346, GLY347, PHE348,
VAL350, HIS351, GLN352, CYS353, LEU354, GLY355, GLN356, LEU358,
ALA359, GLU362, LYS389, ASP391, SER392, THR393, ILE394, or TYR395 of
SEQ ID NO:2.

7. The isolated nucleic acid sequence of claim 1 encoding a variant
comprising SEQ ID NO:43, 44, 45, 46, 47, 48 or 49.

8. A polypeptide encoded by the isolated nucleic acid sequence of claim
1.

9. An isolated nucleic acid molecule that is capable of hybridizing to a
nucleic acid sequence of claim 2, or to the complementary sequence of said nucleic
acid sequence, under hybridization conditions of 3X SSC at 65°C for 16 hours, said
isolated nucleic acid molecule being capable of remaining hybridized to said nucleic
acid sequence, or to the complementary sequence of said nucleic acid sequence, under
wash conditions of 0.5X SSC, 55°C for 30 minutes.

10. An isolated polypeptide comprising SEQ ID NO:2. 

11. An isolated mutant polypeptide of epothilone B hydroxylase of SEQ
ID NO:2 comprising an amino acid sequence with at least one amino acid substitution
in an active site of epothilone B hydroxylase enzyme of SEQ ID NO:2.

12. An isolated mutant polypeptide of epothilone B hydroxylase of SEQ
ID NO:2 comprising an amino acid sequence with at least one amino acid substitution
at amino acid GLU31, ARG67, ARG88, ILE92, ALA93, VAL106, ILE130,
ALA140, MET176, PHE190, GLU231, SER294, PHE237, or ILE365 of SEQ ID
NO:2.

13. An isolated mutant polypeptide of epothilone B hydroxylase of SEQ
ID NO:2 comprising an amino acid sequence with at least one amino acid substitution
at amino acid LEU39, GLN43, ALA45, MET57, LEU58, HIS62, PHE63, SER64,
SER65, ASP66, ARG67, GLN68, SER69, LEU74, MET75, VAL76, ALA77,
ARG78, GLN79, ILE80, ASP84, LYS85, PRO86, PHE87, ARG88, PRO89, SER90,
LEU91, ILE92, ALA93, MET94, ASP95, HIS99, ARG103, PHE110, ILE155,
PHE169, GLN170, CYS172, SER173, SER174, ARG175, MET176, LEU177,
SER178, ARG179, ARG186, PHE190, LEU193, VAL233, GLY234, LEU235,
ALA236, PHE237, LEU238, LEU239, LEU240, ILE241, ALA242, GLY243,
HIS244, GLU245, THR246, THR247, ALA248, ASN249, MET250, LEU283,
THR287, ILE288, ALA289, GLU290, THR291, ALA292, THR293, SER294,
ARG295, PHE296, ALA297, THR298, GLU312, GLY313, VAL314, VAL315,
GLY316, VAL344, ALA345, PHE346, GLY347, PHE348, VAL350, HIS351,
GLN352, CYS353, LEU354, GLY355, GLN356, LEU358, ALA359, GLU362,
LYS389, ASP391, SER392, THR393, ILE394, or TYR395 of SEQ ID NO:2.

14. An isolated mutant polypeptide of epothilone B hydroxylase 
comprising SEQ ID NO: 31, 33, 35, 61, 63, 65, 67, 69, 71, 73 or 75. 

15. An isolated variant polypeptide of epothilone B hydroxylase
comprising SEQ ID NO: 43, 44, 45, 46, 47, 48 or 49.

16. An isolated nucleic acid sequence encoding a ferredoxin.

17. The isolated nucleic acid sequence of claim 16 comprising SEQ ID
NO:3.

18. A polypeptide encoded by the isolated nucleic acid sequence of claim 16.

19. An isolated nucleic acid molecule that is capable of hybridizing to the
nucleic acid sequence set forth in SEQ ID NO:3, or to the complementary sequence of
the nucleic acid sequence set forth in SEQ ID NO:3, under hybridization conditions of
3X SSC at 65°C for 16 hours, said isolated nucleic acid molecule being capable of
remaining hybridized to the nucleic acid sequence set forth in SEQ ID NO:3, or to the
complementary sequence of the nucleic acid sequence set forth in SEQ ID NO:3,
under wash conditions of 0.5X SSC, 55°C for 30 minutes.

20. A vector comprising the isolated nucleic acid sequence of claim 1. 

21. The vector of claim 20 further comprising an isolated nucleic acid
sequence encoding a ferredoxin.

22. A host cell comprising the vector of claim 20.

23. A host cell comprising the vector of claim 21.

24. A method for producing recombinant microorganisms which 
hydroxylate epothilones having a terminal alkyl group to produce epothilones having 
a terminal hydroxyalkyl group, said method comprising transfecting a microorganism 
with the vector of claim 20 or 21. 


25. A recombinantly produced microorganism that hydroxylates 
epothilones having a terminal alkyl group to produce epothilones having a terminal 
hydroxyalkyl group. 

26. The recombinantly produced microorganism of claim 25 wherein said
microorganism expresses a nucleic acid sequence of SEQ ID NO: 1, 30, 32, 34, 36,
37, 38, 39, 40, 41, 42, 60, 62, 64, 66, 68, 72 or 74.

27. A method for the preparation of at least one epothilone of the
following formula I

HO-CH2-(A1)n-(Q)m-(A2)o-E (I)

where

A1 and A2 are independently selected from the group of optionally substituted C1-C3
alkyl and alkenyl;

Q is an optionally substituted ring system containing one to three rings and at least
one carbon to carbon double bond in at least one ring;

n, m, and o are integers selected from the group consisting of zero and 1, where at
least one of m or n or o is 1; and
E is an epothilone core;
comprising the steps of contacting at least one epothilone of the following formula II

CH3-(A1)n-(Q)m-(A2)o-E (II)

where A1, Q, A2, E, n, m, and o are defined as above;
with a recombinantly produced microorganism, or an enzyme derived therefrom,
which is capable of selectively catalyzing the hydroxylation of Formula II, and
effecting said hydroxylation.



28. A method for the preparation of an epothilone analog of Formula A

[Chemical structure of Formula A not reproduced]

said method comprising biotransforming epothilone B to the epothilone analog of
Formula A by incubation with a mutant epothilone B hydroxylase enzyme comprising
SEQ ID NO:31.



29. A compound of Formula A

[Chemical structure of Formula A not reproduced]

or a pharmaceutically acceptable salt thereof.



30. A homology model of epothilone B hydroxylase having a root mean
square deviation of conserved residue backbone atoms of less than about 4.0 Å when
superimposed on the corresponding backbone atoms described by structure coordinates
listed in Appendix 1.
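
For orientation, the root mean square deviation recited above can be computed as sketched below. The sketch assumes the two sets of backbone atoms have already been superimposed and are listed in the same atom order. The coordinates shown are hypothetical placeholders, not values from Appendix 1, and the function name `rmsd` is illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z)
    coordinates, assumed to be already superimposed and in matching order."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical backbone coordinates in angstroms (not taken from Appendix 1).
model_backbone = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
template_backbone = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]

deviation = rmsd(model_backbone, template_backbone)
within_cutoff = deviation < 4.0  # the "less than about 4.0" threshold of the claim
print(deviation, within_cutoff)
```

In practice the superposition itself (for example a least-squares fit over the conserved residues) would be carried out first; that step is outside this sketch.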
31. A method for producing a mutant with altered biological properties,
function, yield of a desired product, rate of reaction, substrate specificity, or activity
as compared to epothilone B hydroxylase, said method comprising the steps of:
identifying an amino acid of SEQ ID NO:2 to mutate; and mutating the identified
amino acid to create a mutant protein.

32. The method of claim 31 wherein a homology model of epothilone B
hydroxylase having a root mean square deviation of conserved residue backbone
atoms of less than about 4.0 Å when superimposed on the corresponding backbone
atoms described by structure coordinates listed in Appendix 1 is used to identify an
amino acid of SEQ ID NO:2 to mutate.

33. The method of claim 31 wherein the identified amino acid is LEU39,
GLN43, ALA45, MET57, LEU58, HIS62, PHE63, SER64, SER65, ASP66, ARG67,
GLN68, SER69, LEU74, MET75, VAL76, ALA77, ARG78, GLN79, ILE80, ASP84,
LYS85, PRO86, PHE87, ARG88, PRO89, SER90, LEU91, ILE92, ALA93, MET94,
ASP95, HIS99, ARG103, PHE110, ILE155, PHE169, GLN170, CYS172, SER173,
SER174, ARG175, MET176, LEU177, SER178, ARG179, ARG186, PHE190,
LEU193, VAL233, GLY234, LEU235, ALA236, PHE237, LEU238, LEU239,
LEU240, ILE241, ALA242, GLY243, HIS244, GLU245, THR246, THR247,
ALA248, ASN249, MET250, LEU283, THR287, ILE288, ALA289, GLU290,
THR291, ALA292, THR293, SER294, ARG295, PHE296, ALA297, THR298,
GLU312, GLY313, VAL314, VAL315, GLY316, VAL344, ALA345, PHE346,
GLY347, PHE348, VAL350, HIS351, GLN352, CYS353, LEU354, GLY355,
GLN356, LEU358, ALA359, GLU362, LYS389, ASP391, SER392, THR393,
ILE394, or TYR395 of SEQ ID NO:2.



34. The method of claim 31 wherein the identified amino acid is GLU31,
ARG67, ARG88, ILE92, ALA93, VAL106, ILE130, ALA140, MET176, PHE190,
GLU231, SER294, PHE237, or ILE365 of SEQ ID NO:2.
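
The two steps recited in claim 31 (identifying a residue, then mutating it) can be sketched in silico as follows. This is a minimal illustration only: the helper `substitute`, the lookup table `THREE_TO_ONE`, and the choice of alanine as the replacement residue are assumptions made for the example and are not taken from the specification. The fragment is the first 32 residues of SEQ ID NO:2 in one-letter code.

```python
import re

THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def substitute(sequence, site, new_residue):
    """Replace the residue named by `site` (e.g. "GLU31") with `new_residue`
    (three-letter code), after checking that the wild-type residue matches."""
    match = re.fullmatch(r"([A-Z]{3})\s*(\d+)", site.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized site label: {site!r}")
    expected = THREE_TO_ONE[match.group(1)]
    position = int(match.group(2))  # 1-based, as in the claims
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != expected:
        raise ValueError(f"residue {position} is {sequence[position - 1]}, not {expected}")
    return sequence[:position - 1] + THREE_TO_ONE[new_residue.upper()] + sequence[position:]


# Residues 1-32 of SEQ ID NO:2; GLU31 is one of the positions listed in claim 34.
fragment = "MTDVEETTATLPLARKCPFSPPPEYERLRRES"
print(substitute(fragment, "GLU31", "ALA"))  # alanine is a hypothetical choice
```

The wild-type check is included so that a mislabeled position is rejected rather than silently mutated.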
35. The method of claim 31 wherein the mutant protein improves yield of
a desired product as compared to the yield of a desired product obtained using
epothilone B hydroxylase.

36. The method of claim 35 wherein the desired product is epothilone F.

37. The method of claim 31 wherein the mutant improves the rate of
reaction as compared to the rate of reaction using epothilone B hydroxylase.

38. The method of claim 31 wherein the mutant exhibits altered substrate
specificity as compared to substrate specificity of epothilone B hydroxylase.

39. The method of claim 38 wherein amino acid SER294 is mutated.

40. The method of claim 31 wherein the mutant exhibits essentially
similar biological activity or function to epothilone B hydroxylase.

41. A machine-readable data storage medium comprising a data storage
material encoded with structure coordinates set forth in Appendix 1.

Alignment used to design primers P450-1+ and P450-1a+

STMSUACB   tcctcatcgccggccacgagac   (SEQ ID NO: 5)
STMSUBCB   tgctggtcgccggccacgagac   (SEQ ID NO: 6)
3702259    tgctcatcaccggccaggacac   (SEQ ID NO: 7)
SSU65940   --ctgttcgccgggcacgactc   (SEQ ID NO: 8)
STMOLEP    tgctcatcgcgggccacgagac   (SEQ ID NO: 9)
SERCP450A  tgctggtcgccgggcacgagac   (SEQ ID NO: 10)
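
As an aid to reading the alignment, the sketch below shows one generic way to collapse aligned sequences into a degenerate consensus using the IUPAC nucleotide ambiguity codes. It is not asserted to be the procedure by which the primers of the specification were designed; the actual primers in the sequence listing differ from a full column-by-column consensus. The function name `degenerate_consensus` is illustrative.

```python
# IUPAC nucleotide ambiguity codes, keyed by the set of bases they represent.
IUPAC_CODES = {
    frozenset("a"): "a", frozenset("c"): "c", frozenset("g"): "g", frozenset("t"): "t",
    frozenset("ag"): "r", frozenset("ct"): "y", frozenset("cg"): "s",
    frozenset("at"): "w", frozenset("gt"): "k", frozenset("ac"): "m",
    frozenset("cgt"): "b", frozenset("agt"): "d", frozenset("act"): "h",
    frozenset("acg"): "v", frozenset("acgt"): "n",
}


def degenerate_consensus(aligned):
    """Return one IUPAC code per alignment column, ignoring gap characters."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    consensus = []
    for column in zip(*aligned):
        bases = frozenset(column) - {"-"}
        consensus.append(IUPAC_CODES[bases] if bases else "-")
    return "".join(consensus)


# The six aligned sequences of SEQ ID NO: 5 to 10 ("-" marks the gap in SEQ ID NO: 8).
alignment = [
    "tcctcatcgccggccacgagac",
    "tgctggtcgccggccacgagac",
    "tgctcatcaccggccaggacac",
    "--ctgttcgccgggcacgactc",
    "tgctcatcgcgggccacgagac",
    "tgctggtcgccgggcacgagac",
]
print(degenerate_consensus(alignment))
```

A primer designer would typically reduce the degeneracy of such a consensus by hand, for instance by fixing the most frequent base at weakly variable positions.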



Alignment used to design primers P450-2+ and P450-2-

STMSUACB   cggcgcggtggaggaactgct   (SEQ ID NO: 11)
STMSUBCB   gggcgccgtcgaggagctgct   (SEQ ID NO: 12)
3702259    ccgcaccctggaggagctgct   (SEQ ID NO: 13)
SSU65940   cggcgcggtcgaggagatgct   (SEQ ID NO: 14)
STMOLEP    cgcggcggtggaggagatgct   (SEQ ID NO: 15)
SERCP450A  cggcgcgatcgaggagaccct   (SEQ ID NO: 16)
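
Primers designed from the same region for opposite strands are reverse complements of one another. The following sketch, which assumes the IUPAC ambiguity codes used in the sequence listing, checks this for SEQ ID NO: 25 and SEQ ID NO: 26 as printed there. The name `reverse_complement` is illustrative.

```python
# Complement table covering a, c, g, t and the IUPAC ambiguity codes.
COMPLEMENT = str.maketrans("acgtrykmswbdhvn", "tgcayrmkswvhdbn")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a (possibly degenerate) DNA sequence."""
    seq = "".join(seq.lower().split())  # drop the spacing used in the listing
    return seq.translate(COMPLEMENT)[::-1]


seq_25 = "cggvgcsvts gaggarmtgc tgcg"   # SEQ ID NO: 25 as listed
seq_26 = "cgcagcakyt cctcsabsgc bccg"   # SEQ ID NO: 26 as listed

print(reverse_complement(seq_25) == "".join(seq_26.split()))
```

The same function can be applied to any of the aligned sequences above to obtain the antisense orientation.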



Alignment used to design primer P450-3-

STMSUACB   ttcggcttcggcgtgcaccagtgcctgggc   (SEQ ID NO: 17)
STMSUBCB   ttcggcttcggcgtccaccagtgcctggga   (SEQ ID NO: 18)
3702259    ttcggctggggcccccaccactgcctgggc   (SEQ ID NO: 19)
SSU65940   ttcggtcacggcgtccacaagtgtcctggc   (SEQ ID NO: 20)
STMOLEP    ttcgggcacggagcgcaccactgcatcggc   (SEQ ID NO: 21)
SERCP450A  ttcggccacggcatccacttctgcgtgggc   (SEQ ID NO: 22)
EPO-B MTDVEETTATLPLARKCPFSPP--PEYERLRRESPVSRVGLPSGQTAWALTRLEDIREML

UINA ATVPDLESDS FHVDWYST YAELRETAPVTPVRPL -GQDAWLVTGYDKAKAAL 

**:* .* . * .**. * :. ** ** :* :: : * 

EPO-B SSPHFSSD--RQSPSFPLMVARQI--RI^DKP-F 
UINA SDLRI^SDPEKKYPGVEVEPPAYLGPPEDVRNYFATNMGT 

*. :: *.. : .. : .::*..:: *** * : *: * **** : 

EPO-B RMKAIOPRIQQIVDEHIDAUAGPKPADLVQALSM 
1JIKA RVEAMRPRVEQITAELIiDEV-GDSGVVDIV^ 

*::*::**::**. * :* : ... .*:*: :: *:* ******** . * * 

EPO-B SRMI^REVT-AEERMTAFESLENYUJELVTKK^^ 

UINA S E I LVMD PERAEQRGOAAREVVNF I LDLVERRRTE PGDDLL SALI SVQD 

*.;* ; **;* * ... *.. .** ...... ;* * . ;;; ;.*. . .** 

EPO-B VGLAFLIiL IAGHETTANMISLGTVTIiliE^ EELLRI PTIAETA 

UINA TSIALVLLLAGFEASVSLIGIGTYLIJjTHPDQLALVRADPSALPNAVEEILRYI^ 

-.:*::**:**.*::..:*.:** ** •.***** ::***, *.**.*♦ :: .** : 

EPO-B TSRFATADVEIGGTLIRAGEGWGLSNAGNHDPIX5FENPDTFDIERGARHHVAFGFGVHQ 
UINA T-RFAAEEOTIGGVAIPQYSTVLVANGAANRDPSQFPDPHRFDVTW>^ 

* ***. .*****. * . * ; ..*.*.**. * ; * 4 **. *..* *..** *-* 

EPO-B a^NLARIiEIiQIVFDTLFRRVPGIRIAVPVDELP - 

UINA CMGRPLAKLEGEVALRALFGRFPALSLGIDi^^ 

*:*:**:**::.::****• * *• *** 
SEQUENCE LISTING
<110> Bristol-Myers Squibb Company 

<120> COMPOSITIONS AND METHODS FOR HYDROXYLATING EPOTHILONES 

<130> D0231 PCT2 

<150> US 10/321,188 

<151> 2002-12-17 

<160> 76 

<170> PatentIn version 3.1

<210> 1 
<211> 1186 
<212> DNA 

<213> Amycolatopsis orientalis 
<400> 1 

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60 

ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120 

tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180 

ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240 

cggcgcgagg acaagccgtt ccgcccgtcc ctcatcgcga tggacccgcc ggaacacggc 300 

aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360 

cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420 

gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480 

gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540 

gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600 

gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660 

cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcgtt cctcctgctc 720 

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780 

aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840 

gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900 

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960 

ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020 

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080 

ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140 
gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacg 1186



<210> 2 
<211> 404 
<212> PRT 

<213> Amycolatopsis orientalis
<400> 2 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Ile Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Phe Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp
<210> 3
<211> 195 
<212> DNA 

<213> Amycolatopsis orientalis
<400> 3 

atgaagatca tcgcggacac cgggaagtgc gtgggggcgg gccagtgcgt gctcaccgat 60 
cccgatctgt tcgaccagag cgaggacgac gggacggtcc tcctgctgaa cgccgagccc 120 
gaaggcgaag aggcggagga gaacgcgcgc accgccgtgc acatctgccc ggggcaggca 180 
ctttcgctcg cgtag 195 



<210> 4 
<211> 64 
<212> PRT 

<213> Amycolatopsis orientalis 
<400> 4 

Met Lys Ile Ile Ala Asp Thr Gly Lys Cys Val Gly Ala Gly Gln Cys
1 5 10 15

Val Leu Thr Asp Pro Asp Leu Phe Asp Gln Ser Glu Asp Asp Gly Thr
20 25 30

Val Leu Leu Leu Asn Ala Glu Pro Glu Gly Glu Glu Ala Glu Glu Asn
35 40 45

Ala Arg Thr Ala Val His Ile Cys Pro Gly Gln Ala Leu Ser Leu Ala
50 55 60



<210> 5 

<211> 22 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 5 

tcctcatcgc cggccacgag ac 22 



<210> 6 

<211> 22 

<212> DNA 

<213> Artificial sequence 
<220> 



<223> Synthetic
<400> 6 

tgctggtcgc cggccacgag ac 22 



<210> 7 

<211> 22 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 7 

tgctcatcac cggccaggac ac 22 



<210> 8 

<211> 20 

<212> DNA

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 8 

ctgttcgccg ggcacgactc 20 



<210> 9 

<211> 22 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 9 

tgctcatcgc gggccacgag ac 22 



<210> 10 

<211> 22 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 10 

tgctggtcgc cgggcacgag ac 22 



<210> 11 
<211> 21 
<212> DNA 



<213> Artificial sequence
<220>

<223> Synthetic

<400> 11

cggcgcggtg gaggaactgc t 21



<210> 12 

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 12 

gggcgccgtc gaggagctgc t 21 

<210> 13

<211> 21

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 13

ccgcaccctg gaggagctgc t 21



<210> 14

<211> 21

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 14

cggcgcggtc gaggagatgc t 21



<210> 15

<211> 21

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 15

cgcggcggtg gaggagatgc t 21



<210> 16

<211> 21

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 16

cggcgcgatc gaggagaccc t 21



<210> 17 

<211> 30 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 



<400> 17 

ttcggcttcg gcgtgcacca gtgcctgggc 30

<210> 18

<211> 30

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 18

ttcggcttcg gcgtccacca gtgcctggga 30

<210> 19

<211> 30

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 19

ttcggctggg gcccccacca ctgcctgggc 30



<210> 20

<211> 30

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 20

ttcggtcacg gcgtccacaa gtgtcctggc 30



<210> 21 

<211> 30 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 



<400> 21 

ttcgggcacg gagcgcacca ctgcatcggc 30



<210> 22 

<211> 30 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 22 

ttcggccacg gcatccactt ctgcgtgggc 30



<210> 23 

<211> 25 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 



<400> 23 

tgctgctsdt cgccggbcab gasac 25



<210> 24

<211> 25

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<220>

<221> misc_feature

<222> (9)..(9)

<223> n=a, c, g or t

<400> 24

tgmtssysnt cgscgsbcay gasac 25



<210> 25

<211> 24 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 25 

cggvgcsvts gaggarmtgc tgcg 24 



<210> 26

<211> 24

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 26

cgcagcakyt cctcsabsgc bccg 24



<210> 27

<211> 30

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 27

gcccaggcas ahcacsywg gcdybggctt 30

<210> 28

<211> 27

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 28

gcgagatcta cctggggaag gacaacc 27



<210> 29

<211> 27

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 29

gcgaagctta cggacttgga ccctacg 27

<210> 30
<211> 1215 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 30 

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60 

ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120 

tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180 

ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240 

cggcgcgagg acaagccgtt ccgcccgtcc ctcatcgcga tggacccgcc ggaacacggc 300 

aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360 

cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420 


gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480 

gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540 

gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agagctatct cgacgaactc 600 

gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660 

cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcgtt cttgctgctc 720 

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780 

aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840 

gaactcctgc ggatcttcac catcgcggag acggcgaccc cacgcttcgc cacggcggac 900

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960 

ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020 

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080 

ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140 

gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200 

ccggtcacct ggtag 1215

<210> 31 
<211> 404 
<212> PRT

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 31 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys 
1 5 10 15



Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser 
20 25 30



Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gin Thr Ala Trp Ala Leu 
35 40 45 



Thr Arg Leu Glu Asp He Arg Glu Met Leu Ser Ser Pro His Phe Ser 
50 55 60 



Ser Asp Arg Gin Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gin He 
65 70 75 80 



Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu He Ala Met Asp Pro 
85 90 95 



Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val 
100 105 110



Lys Arg Met Lys Ala Leu Gin Pro Arg lie Gin Gin He Val Asp Glu 
115 120 125 



His He Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gin 
130 135 140



Ala Leu Ser Leu Pro Val Pro Ser Leu Val He Cys Glu Leu Leu Gly 
145 150 155 160 



Val Pro Tyr Ser Asp His Glu Phe Phe Gin Ser Cys Ser Ser Arg Met 
165 170 175 



Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser 
180 185 190 



Leu Glu Ser Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala 
195 200 205



Thr Glu Asp Asp Leu Leu Gly Arg Gin lie Leu Lys Gin Arg Glu Ser 
210 215 220 



Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Phe Leu Leu Leu 
225 230 235 240 



lie Ala Gly His Glu Thr Thr Ala Asn Met He Ser Leu Gly Thr Val 
245 250 255 



Thr Leu Leu Glu Asn Pro Asp Gin Leu Ala Lys He Lys Ala Asp Pro 
260 265 270 



Gly Lys Thr Leu Ala Ala He Glu Glu Leu Leu Arg He Phe Thr He 
275 280 285 



Ala Glu Thr Ala Thr Pro Arg Phe Ala Thr Ala Asp Val Glu He Gly 
290 295 300 



Gly Thr Leu He Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala 
305 310 315 320 



Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp He 
325 330 335 



Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gin 
340 345 350 



Cys Leu Gly Gin Asn Leu Ala Arg Leu Glu Leu Gin He Val Phe Asp 
355 360 365 



Thr Leu Phe Arg Arg Val Pro Gly He Arg He Ala Val Pro Val Asp 
370 375 380 



Glu Leu Pro Phe Lys His Asp Ser Thr lie Tyr Gly Leu His Ala Leu 
385 390 395 400 



Pro Val Thr Trp 



<210> 32 
<211> 1215 
<212> DNA

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 32 

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60 

ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120 

tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180 

ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240 

cggcgcgagg acaagccgtt ccgcccgtcc ctcatcgcga tggacccgcc ggaacacggc 300 

aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360 

cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420 

gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480 

gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540 

gtcaccgccg aagaacggat gaccgcgtac gagtcgctcg agaactatct cgacgaactc 600 

gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660 

cagcgcgaat ccggcgaagc cgaccacggc cgcctggtcg gtctggcgtt cctcctgctc 720 

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780 

aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840 

gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960 

ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020 

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080 

ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140 

gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200 

ccggtcacct ggtag 1215 

<210> 33 

<211> 404 

<212> PRT 

<213> Artificial sequence

<220> 

<223> Synthetic 
<400> 33

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys 
1 5 10 15 



Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser 
20 25 30 



Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gin Thr Ala Trp Ala Leu 
35 40 45 



Thr Arg Leu Glu Asp lie Arg Glu Met Leu Ser Ser Pro His Phe Ser 
50 55 60 



Ser Asp Arg Gin Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gin lie 
65 70 75 80 



Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu lie Ala Met Asp Pro 
85 90 95 



Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val 
100 105 110 



Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125 



His lie Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gin 
130 135 140 



Ala Leu Ser Leu Pro Val Pro Ser Leu Val lie Cys Glu Leu Leu Gly 
145 150 155 160 



Val Pro Tyr Ser Asp His Glu Phe Phe Gin Ser Cys Ser Ser Arg Met 
165 170 175 



Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Tyr Glu Ser 
180 185 190 



Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala 
195 200 205 



Thr Glu Asp Asp Leu Leu Gly Arg Gin He Leu Lys Gin Arg Glu Ser 
210 215 220 
Gly Glu Ala Asp His Gly Arg Leu Val Gly Leu Ala Phe Leu Leu Leu
225 230 235 240 



He Ala Gly His Glu Thr Thr Ala Asn Met He Ser Leu Gly Thr Val 
245 250 255 

Thr Leu Leu Glu Asn Pro Asp Gin Leu Ala Lys He Lys Ala Asp Pro 
260 265 270 



Gly Lys Thr Leu Ala Ala He Glu Glu Leu Leu Arg He Phe Thr He 
275 280 285 



Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu He Gly 
290 295 300 



Gly Thr Leu He Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala 
305 310 315 320 



Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp He 
325 330 335 



Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gin 
340 345 350 



Cys Leu Gly Gin Asn Leu Ala Arg Leu Glu Leu Gin He Val Phe Asp 
355 360 365 



Thr Leu Phe Arg Arg Val Pro Gly He Arg He Ala Val Pro Val Asp 
370 375 380 



Glu Leu Pro Phe Lys His Asp Ser Thr lie Tyr Gly Leu His Ala Leu 
385 390 395 400 



Pro Val Thr Trp 



<210> 34 

<211> 1215

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 34

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60 

ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120 

tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180 

ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240 

cggcgcgagg acaagccgtt ccgcccgtcc ctcgtcgcga tggacccgcc ggaacacggc 300 

aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360 

cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420 

gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480 

gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540 

gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600 

gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660 

cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcggc gctcctgctc 720 

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780 

aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840 

gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900 

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960 

ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020 

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080 

ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140 

gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200 

ccggtcacct ggtag 1215 

<210> 35 
<211> 404 
<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 35 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys






WO 2004/061116 



PCT/US2003/034082 



Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser 
20 25 30 



Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gin Thr Ala Trp Ala Leu 
35 40 45 



Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60 



Ser Asp Arg Gin Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gin He 
65 70 75 80 



Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Val Ala Met Asp Pro 
85 90 95 



Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val 
100 105 110 



Lys Arg Met Lys Ala Leu Gin Pro Arg He Gin Gin He Val Asp Glu 
115 120 125 



His lie Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gin 
130 135 140



Ala Leu Ser Leu Pro Val Pro Ser Leu Val He Cys Glu Leu Leu Gly 
145 150 155 160 



Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175 



Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser 
180 185 190 



Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205 



Thr Glu Asp Asp Leu Leu Gly Arg Gin He Leu Lys Gin Arg Glu Ser 
210 215 220 



Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu 
225 230 235 240 
Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255 



Thr Leu Leu Glu Asn Pro Asp Gin Leu Ala Lys He Lys Ala Asp Pro 
260 265 270 



Gly Lys Thr Leu Ala Ala He Glu Glu Leu Leu Arg He Phe Thr He 
275 280 285 



Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu He Gly 
290 295 300 



Gly Thr Leu He Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala 
305 310 315 320 



Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp He 
325 330 335 



Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gin 
340 345 350 



Cys Leu Gly Gin Asn Leu Ala Arg Leu Glu Leu Gin He Val Phe Asp 
355 360 365



Thr Leu Phe Arg Arg Val Pro Gly He Arg He Ala Val Pro Val Asp 
370 375 380 



Glu Leu Pro Phe Lys His Asp Ser Thr He Tyr Gly Leu His Ala Leu 
385 390 395 400 



Pro Val Thr Trp 



<210> 36 
<211> 1104 
<212> DNA 

<213> Amycolatopsis orientalis 
<400> 36 

gcgaccttgc cgctggcccg caaatgcccg ttttcaccgc cgcccgaata cgagcggctt 60 
cgccgggaaa gtccggtttc ccgggtcggt ctcccgtccg gtcaaaccgc ttgggcgctc 120 
acccggctcg aggacatccg cgaaatgctg agcagtccgc atttcagctc cgaccggcag 180 
agtccgtcgt tcccgctgat ggtggcccgg cagatccggc gcgaggacaa gccgttccgc 240
ccgtccctca tcgcgatgga cccgccggaa cacagcaagg ccaggcgtga cgtcgtcggg 300
gaattcaccg tcaagcgcat gaaagcgctt cagccgcgta ttcagcagat cgtcgacgag 360
catatcgacg ccatgctcgc cggccccaaa cccgccgatc tcgtccaggc gctttccctg 420
ccggttccgt ccttggtgat ctgcgaactg ctcggtgtcc cctattcgga ccacgagttc 480
ttccagtcct gcagttcccg gatgctcagc cgggaagtca ccgccgaaga acggatgacc 540
gcgttcgagt cgctcgagaa ctatctcgac gaactcgtca cgaagaagga ggcgaacgcc 600
accgaggacg acctcctcgg ccgccagatc ctgaagcagc gcgaaacggg cgaagccgac 660
cacggcgaac tcgtcgggct ggcgttcctg ctgctcatcg cgggacacga gacgacggcg 720
aacatgatct cgctcggcac ggcgaccctg ctggagaacc ccgaccagct ggcgaagatc 780
aaggccgatc cgggcaagac cctcgccgcg atcgaggagc tcctgcgggt cttcaccatc 840
gcggagacgg cgacctcacg cttcgccacg gcggacgtcg agatcggcgg cacgctcatc 900
cgcgcgggtg aaggcgtcgt cggcctgagc aacgcgggca accacgatcc ggaaggcttc 960
gagaacccgg acgccttcga catcgaacgc ggcgcgcggc accacgtcgc cttcggattc 1020
ggtgtgcacc aatgcctcgg ccagaacttg gcgaggttgg aactccagat cgtgttcgat 1080
acgttgttcc ggcgagtgcc gggc 1104



<210> 37 
<211> 1103 
<212> DNA 

<213> Amycolatopsis orientalis 
<400> 37 

gaccttgccg ctggcccgga aatgcccgtt ttcgccgccg cccgaatacg aacggcttcg 60 
ccgggaaagt ccggtttccc gggtcggtct cccgtccggt caaacggctt gggcgctcac 120 
ccggctcgaa gacatccgcg aaatgctgag cagcccgcat ttcagttccg accggcagag 180 
cccgtcgttc ccgctgatgg tcgcgcggca gatccgccgc gaggacaagc cgttccgccc 240 
ctccctcatc gcgatggatc cgccggaaca cagccgggcc aggcgtgacg tcgtcgggga 300 
attcaccgtc aagcggatga aggcgctcca gccgcgaatt cagcagatcg tcgacgaaca 360 
tctcgacgcc ctgctcgcgg gccccaaacc cgccgatctc gtccaggcgc tttccctgcc 420 
cgttccctcg ctggtgatct gcgaactgct cggcgtcccc tattcggacc acgagttctt 480 
ccagtcctgc agttccagga tgctcagccg ggaggtcacc gccgaagaac ggatgaccgc 540 
gttcgagcag ctcgaaaact atctcgacga actggtcacc aagaaggagg cgaacgccac 600

cgaggacgac ctcctcggcc gtcagatcct gaaacagcgg gaaacgggcg aggccgacca 660 

cggtgaactc gtcgggctgg cgttcctgct gctcatcgcc ggacacgaga ccacggcgaa 720 

catgatctcg ctcggcacgg tgaccctgct ggagaatccc gatcagctcg cgaagatcaa 780 

ggcagacccc ggcaagaccc tcgccgccat cgaggaactc ctgcgggtct tcacgatcgc 840 

ggaaacggcg acctcacgct tcgccacggc ggacgtcgag atcggcggaa cgctgatccg 900 

cgcgggggaa ggggtggtgg gcctgagcaa cgcgggcaac cacgatccgg acggcttcga 960

gaacccggac accttcgaca tcgaacgcgg cgcgcggcat cacgtcgcgt tcggattcgg 1020 

ggtgcaccag tgtctcggcc agaacttggc gaggttggaa ctccagafccg tctjtcgatac 1080 

gttgttccgg cgagtgccgg gcc 1103 

<210> 38 
<211> 817 
<212> DNA 

<213> Amycolatopsis orientalis 
<400> 38 

cttcacccgc gcggatgagc gtgccgccga tctcgacgtc cgccgtggcg aagcgtgagg 60 

tcgccgtctc cgcgatggtg aagatccgca ggagttcctc gatcgcggcg agggtcttgc 120 

ccggatccgc cttgatcttc gccagctgat cggggttctc cagcagggtc accgtgccga 180 

gcgagatcat gttcgccgta gtctcgtgcc ccgcgatgag caggaggaac gccagaccga 240 

ccagttcgcc gtggtcggct tcgccggatt cgcgctgctt caggatctgg cggccgagga 300 

ggtcgtcctc ggtggcgttc gcctccttct tcgtgacgag ttcgtcgaga tagttctcga 360 

gcgactcgaa cgcggtcatc cgttcttcgg cggtgacttc ccggctgagc atccgggaac 420 

tgcaggactg gaagaactcg tggtccgaat aggggacacc gagcagttcg cagatcacca 480 

aggacggaac cggcagggaa agcgcctgga cgagatcggc gggtttgggg ccggcgagca 540 

gggcgtcgat atgctcgtcg acgatctgct gaatacgtgg ctgaagcgct ttcatgcgct 600 

tgacggtgaa ttccccgacg acgtcacgcc tggccttgcc gtgttccggc gggtccatcg 660 

cgatgaggga cgggcggaac ggcttgtcct cgcgccggat ctgccgcgcc accatcagcg 720 

ggaacgacgg actctgccgg tcggagctga aatgcggact gctcagcatt tcgcggatgt 780 

cttcgagccg ggtgagcgcc caagcggttt gaccgga 817 

<210> 39 
<211> 1105
<212> DNA

<213> Amycolatopsis orientalis
<400> 39 

ccgcgacctt gccgctggcc cgcaaatgcc cgttttcacc gccgcccgaa tacgagcggc 60 

ttcgccggga aagtccggtt tcccgggtcg gtctcccgtc cggtcaaacc gcttgggcgc 120 

tcacccggct cgaggacatc cgcgaaatgc tgagcagtcc gcatttcagc tccgaccggc 180 

agagtccgtc gttcccgctg atggtggccc ggcagatccg gcgcgaggac aagccgttcc 240 

gcccgtccct catctcgatg gacccgccgg aacacagcaa ggccaggcgt gacgtcgtcg 300 

gggaattcac cgtcaagcgc atgaaagcgc ttcagccgcg tattcagcag atcgtcgacg 360 

agcatatcga cgccctgctc gccggcccca aacccgccga tctcgtccag gcgctttccc 420 

tgccggttcc gtccttggtg atctgcgaac tgctcggtgt cccctattcg gaccacgagt 480

tcttccagtc ctgcagttcc cggatgctca gccgggaagt caccgccgaa gaacggatga 540 

ccgcgttcga gtcgctcgag aactatctcg acgaactcgt cacgaagaag gaggcgaacg 600 

ccaccgagga cgacctcctc ggccgccaga tcctgaagca gcgcgaaacg ggcgaagccg 660 

accacggcga actggtcggg ctggcgttcc tcctgctcat cgcgggacac gagacgacgg 720 

cgaacatgat ctcgctcggc acggcgaccc tgctggagaa ccccgaccag ctggcgaaga 780 

tcaaggccga tccgggcaag accctcgccg cgatcgagga gctcctgcgg gtcttcacca 840 

tcgcggagac ggcgacctca cgcttcgcca cggcggacgt cgagatcggc ggcacgctca 900 

tccgcgcggg tgaaggcgtc gtcggcctga gtaacgcggg caaccacgat ccggaaggct 960 

tcgagaaccc ggacgccttc gacatcgaac gcggcgcgcg gcaccacgtc gccttcggat 1020 

tcggtgtgca ccaatgcctc ggccagaact tggcgaggtt ggaactccag atcgtgttcg 1080 

atacgttgtt ccggcgagtg ccggg 1105 

<210> 40 
<211> 1304 
<212> DNA 

<213> Amycolatopsis orientalis 
<400> 40 

ccttgccact ggcccgcaaa tgcccgtttt caccaccgcc cgaatacgag cggctccgcc 60 

gggaaagtcc ggtttcccgg gtcggtctcc cctccggtca aaccgcttgg gcgctcaccc 120 

ggctcgaaga catccgcgaa atgctgagca gtccgcattt cagctccgac cggcagagtc 180 

cgtcgttccc gctgatggtg gcgcggcaga tccggcgcga ggacaagccg ttccgcccgt 240

ccctcatcgc gatggacccg ccggaacacg gcaaggccag gcgtgacgtc gtcggggaat 300

tcaccgtcaa gcgcatgaaa gcgcttcagc cacgtattca gcagatcgtc gacgagcata 360

tcgacgccct gctcgccggc cccaaacccg ccgatctcgt ccaggcgctt tccctgccgg 420

ttccgtcctt ggtgatctgc gaactgctcg gtgtccccta ttcggaccac gagttcttcc 480

agtcctgcag ttcccggatg ctcagccggg aagtcaccgc cgaagaacgg atgaccgcgt 540

tcgagtcgct cgagaactat ctcgacgaac tcgtcacgaa gaaggaggcg aacgccaccg 600

aggacgacct cctcggccgc cagatcctga agcagcgcga atccggcgaa gccgaccacg 660

gcgaactggt cggtctggcg ttcctcctgc tcatcgcggg gcacgagact acggcgaaca 720

tgatctcgct cggcacggtg accctgctgg agaaccccga tcagctggcg aagatcaagg 780

cggatccggg caagaccctc gccgcgatcg aggaactcct gcggatcttc accatcgcgg 840

agacggcgac ctcacgcttc gccacggcgg acgtcgagat cggcggcacg ctcatccgcg 900

cgggtgaagg cgtcgtcggc ctgagcaacg cgggcaacca cgatccggac ggcttcgaga 960

acccggacac cttcgacatc gaacgcggcg cgcggcatca cgtcgccttc ggattcggtg 1020

tgcaccaatg cctcggccag aacttggcga ggttggaact ccagatcgtg ttcgatacgt 1080

tgttccggcg agtgccgggc atccggatcg ccgtaccggt cgacgaactg ccgttcaagc 1140

acgattcgac gatctacggc ctccgcgccc tgccggtcac ctggtaggag gagccatgaa 1200

gatcatcgcg gacaccggga agtgcgtggg ggcgggccag tgcgtgctca ccgatcccga 1260

tctgttcgac cagagcgagg acgacgggac ggtcctcctg ctga 1304



<210> 41 
<211> 825 
<212> DNA 

<213> Amycolatopsis orientalis
<400> 41 

ctccggtcaa accgcttggg cgctcacccg gctcgaagac atccgcgaaa tgctgagcag 60 
tccgcatttc agctccgacc ggcagaatcc gtcgttcccg ctgatggtgg cgcggcagat 120 
ccggcgcgag gacaagccgt tccgcccgtc cctcatcgcg atggacccgc cggaacacag 180 
caaggccagg cgtgacgtcg tcggggaatt caccgtcaag cgcatgaaag cgcttcagcc 240 
gcgtattcag cagatcgtcg acgagcatat cgacgccctg ctcgccggcc ccaaacccgc 300 
cgatctcgtc caggcgcttt ccctgccggt tccgtccttg gtgatctgcg aactgctcgg 360 
tgtcccctat tcggaccacg agttcttcca gtcctgcagt tcccggatgc tcagccggga 420 
agtcaccgcc gaagaacgga tgaccgcgtt cgagtcgctc gagaactatc tcgacgaact 480

cgtcacgaag aaggaggcga acgccaccga ggacgacctc ctcggccgcc agatcctgaa 540 

gcagcgggaa acgggcgagg ccgaccacgg cgaactcgtc gggctggcgt tcctgctgct 600 

catcgccggg cacgagacga cggcgaacat gatctcgctc ggcacggcga ccctgctgga 660 

gaaccccgac cagctggcga agatcaaggc ggatccgggc aagaccctcg ccgcgatcga 720 

ggaactgctg cgcgtcttca cgatcgcgga gacggcgacc tcacgcttcg ccacggcgga 780 

cgtcgagatc ggcggcacgc tcatccgcgc gggtgaaggc gtcgt 825 

<210> 42 
<211> 1103 
<212> DNA 

<213> Amycolatopsis orientalis 
<400> 42 

gcgaccttgc cactggcccg caaatgcccg ttttcaccac cgcccgaata cgagcggctc 60 

cgccgggaaa gtccggtttc ccgggtcggt ctcccctccg gtcaaaccgc ttgggcgctc 120 

acccggctcg aagacatccg cgaaatgctg agcagtccgc atttcagctc cgaccggcag 180 

agtccgtcgt tcccgctgat ggtggcgcgg cagatccggc gcgaggacaa gccgttccgc 240 

ccgtccctca tcgcgatgga cccgccggaa cacggcaagg ccaggcgtga cgtcgtcggg 300 

gaattcaccg tcaagcgcat gaaagcgctt cagccacgta ttcagcagat cgtcgacgag 360 

catatcgacg ccctgctcgc cggccccaaa cccgccgatc tcgtccaggc gctttccctg 420 

ccggttccgt ccttggtgat ctgcgaactg ctcggtgtcc cctattcgga ccacgagttc 480

ttccagtcct gcagttcccg gatgctcagc cgggaagtca ccgccgaaga acggatgacc 540

gcgttcgagt cgctcgagaa ctatctcgac gaactcgtca cgaagaagga ggcgaacgcc 600 

accgaggacg acctcctcgg ccgccagatc ctgaagcagc gcgaatccgg cgaagccgac 660 

cacggcgaac tggtcggtct ggcgttcctc ctgctcatcg cggggcacga gactacggcg 720 

aacatgatct cgctcggcac ggtgaccctg ctggagaacc ccgatcagct ggcgaagatc 780 

aaggcggatc cgggcaagac cctcgccgcg atcgaggaac tcctgcggat cttcaccatc 840 

gcggagacgg cgacctcacg cttcgccacg gcggacgtcg agatcggGgg cacgctcatc 900 

cgcgcgggtg aaggcgtcgt cggcctgagc aacgcgggca accacgatcc ggacggcttc 960 

gagaacccgg acaccttcga catcgaacgc ggcgcgcggc atcacgtcgc cttcggattc 1020 

ggtgtgcacc aatgcctcgg ccagaacttg gcgaggttgg aactccagat cgtgttcgat 1080 
acgttgttcc ggcgagtgcc ggg 1103



<210> 43 

<211> 402 

<212> PRT 

<213> Amycolatopsis orientalis 

<400> 43 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Ile Ala Met Asp Pro
85 90 95

Pro Glu His Ser Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Met Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Thr
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Phe Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Ala
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Val Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Glu Gly Phe Glu Asn Pro Asp Ala Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val

<210> 44

<211> 367

<212> PRT

<213> Amycolatopsis orientalis

<400> 44 

Thr Leu Pro Leu Ala Arg Lys Cys Pro Phe Ser Pro Pro Pro Glu Tyr
1 5 10 15

Glu Arg Leu Arg Arg Glu Ser Pro Val Ser Arg Val Gly Leu Pro Ser
20 25 30

Gly Gln Thr Ala Trp Ala Leu Thr Arg Leu Glu Asp Ile Arg Glu Met
35 40 45

Leu Ser Ser Pro His Phe Ser Ser Asp Arg Gln Ser Pro Ser Phe Pro
50 55 60

Leu Met Val Ala Arg Gln Ile Arg Arg Glu Asp Lys Pro Phe Arg Pro
65 70 75 80

Ser Leu Ile Ala Met Asp Pro Pro Glu His Ser Arg Ala Arg Arg Asp
85 90 95

Val Val Gly Glu Phe Thr Val Lys Arg Met Lys Ala Leu Gln Pro Arg
100 105 110

Ile Gln Gln Ile Val Asp Glu His Leu Asp Ala Leu Leu Ala Gly Pro
115 120 125

Lys Pro Ala Asp Leu Val Gln Ala Leu Ser Leu Pro Val Pro Ser Leu
130 135 140

Val Ile Cys Glu Leu Leu Gly Val Pro Tyr Ser Asp His Glu Phe Phe
145 150 155 160

Gln Ser Cys Ser Ser Arg Met Leu Ser Arg Glu Val Thr Ala Glu Glu
165 170 175

Arg Met Thr Ala Phe Glu Gln Leu Glu Asn Tyr Leu Asp Glu Leu Val
180 185 190

Thr Lys Lys Glu Ala Asn Ala Thr Glu Asp Asp Leu Leu Gly Arg Gln
195 200 205

Ile Leu Lys Gln Arg Glu Thr Gly Glu Ala Asp His Gly Glu Leu Val
210 215 220

Gly Leu Ala Phe Leu Leu Leu Ile Ala Gly His Glu Thr Thr Ala Asn
225 230 235 240

Met Ile Ser Leu Gly Thr Val Thr Leu Leu Glu Asn Pro Asp Gln Leu
245 250 255

Ala Lys Ile Lys Ala Asp Pro Gly Lys Thr Leu Ala Ala Ile Glu Glu
260 265 270

Leu Leu Arg Val Phe Thr Ile Ala Glu Thr Ala Thr Ser Arg Phe Ala
275 280 285

Thr Ala Asp Val Glu Ile Gly Gly Thr Leu Ile Arg Ala Gly Glu Gly
290 295 300

Val Val Gly Leu Ser Asn Ala Gly Asn His Asp Pro Asp Gly Phe Glu
305 310 315 320

Asn Pro Asp Thr Phe Asp Ile Glu Arg Gly Ala Arg His His Val Ala
325 330 335

Phe Gly Phe Gly Val His Gln Cys Leu Gly Gln Asn Leu Ala Arg Leu
340 345 350

Glu Leu Gln Ile Val Phe Asp Thr Leu Phe Arg Arg Val Pro Gly
355 360 365



<210> 45 
<211> 272 
<212> PRT 

<213> Amycolatopsis orientalis 
<400> 45 

Ser Gly Gln Thr Ala Trp Ala Leu Thr Arg Leu Glu Asp Ile Arg Glu
1 5 10 15

Met Leu Ser Ser Pro His Phe Ser Ser Asp Arg Gln Ser Pro Ser Phe
20 25 30

Pro Leu Met Val Ala Arg Gln Ile Arg Arg Glu Asp Lys Pro Phe Arg
35 40 45

Pro Ser Leu Ile Ala Met Asp Pro Pro Glu His Gly Lys Ala Arg Arg
50 55 60

Asp Val Val Gly Glu Phe Thr Val Lys Arg Met Lys Ala Leu Gln Pro
65 70 75 80

Arg Ile Gln Gln Ile Val Asp Glu His Ile Asp Ala Leu Leu Ala Gly
85 90 95

Pro Lys Pro Ala Asp Leu Val Gln Ala Leu Ser Leu Pro Val Pro Ser
100 105 110

Leu Val Ile Cys Glu Leu Leu Gly Val Pro Tyr Ser Asp His Glu Phe
115 120 125

Phe Gln Ser Cys Ser Ser Arg Met Leu Ser Arg Glu Val Thr Ala Glu
130 135 140

Glu Arg Met Thr Ala Phe Glu Ser Leu Glu Asn Tyr Leu Asp Glu Leu
145 150 155 160

Val Thr Lys Lys Glu Ala Asn Ala Thr Glu Asp Asp Leu Leu Gly Arg
165 170 175

Gln Ile Leu Lys Gln Arg Glu Ser Gly Glu Ala Asp His Gly Glu Leu
180 185 190

Val Gly Leu Ala Phe Leu Leu Leu Ile Ala Gly His Glu Thr Thr Ala
195 200 205

Asn Met Ile Ser Leu Gly Thr Val Thr Leu Leu Glu Asn Pro Asp Gln
210 215 220

Leu Ala Lys Ile Lys Ala Asp Pro Gly Lys Thr Leu Ala Ala Ile Glu
225 230 235 240

Glu Leu Leu Arg Ile Phe Thr Ile Ala Glu Thr Ala Thr Ser Arg Phe
245 250 255

Ala Thr Ala Asp Val Glu Ile Gly Gly Thr Leu Ile Arg Ala Gly Glu
260 265 270



<210> 46 
<211> 367 
<212> PRT 

<213> Amycolatopsis orientalis
<400> 46 

Ala Thr Leu Pro Leu Ala Arg Lys Cys Pro Phe Ser Pro Pro Pro Glu
1 5 10 15

Tyr Glu Arg Leu Arg Arg Glu Ser Pro Val Ser Arg Val Gly Leu Pro
20 25 30

Ser Gly Gln Thr Ala Trp Ala Leu Thr Arg Leu Glu Asp Ile Arg Glu
35 40 45

Met Leu Ser Ser Pro His Phe Ser Ser Asp Arg Gln Ser Pro Ser Phe
50 55 60

Pro Leu Met Val Ala Arg Gln Ile Arg Arg Glu Asp Lys Pro Phe Arg
65 70 75 80

Pro Ser Leu Ile Ser Met Asp Pro Pro Glu His Ser Lys Ala Arg Arg
85 90 95

Asp Val Val Gly Glu Phe Thr Val Lys Arg Met Lys Ala Leu Gln Pro
100 105 110

Arg Ile Gln Gln Ile Val Asp Glu His Ile Asp Ala Leu Leu Ala Gly
115 120 125

Pro Lys Pro Ala Asp Leu Val Gln Ala Leu Ser Leu Pro Val Pro Ser
130 135 140

Leu Val Ile Cys Glu Leu Leu Gly Val Pro Tyr Ser Asp His Glu Phe
145 150 155 160

Phe Gln Ser Cys Ser Ser Arg Met Leu Ser Arg Glu Val Thr Ala Glu
165 170 175

Glu Arg Met Thr Ala Phe Glu Ser Leu Glu Asn Tyr Leu Asp Glu Leu
180 185 190

Val Thr Lys Lys Glu Ala Asn Ala Thr Glu Asp Asp Leu Leu Gly Arg
195 200 205

Gln Ile Leu Lys Gln Arg Glu Thr Gly Glu Ala Asp His Gly Glu Leu
210 215 220

Val Gly Leu Ala Phe Leu Leu Leu Ile Ala Gly His Glu Thr Thr Ala
225 230 235 240

Asn Met Ile Ser Leu Gly Thr Ala Thr Leu Leu Glu Asn Pro Asp Gln
245 250 255

Leu Ala Lys Ile Lys Ala Asp Pro Gly Lys Thr Leu Ala Ala Ile Glu
260 265 270

Glu Leu Leu Arg Val Phe Thr Ile Ala Glu Thr Ala Thr Ser Arg Phe
275 280 285

Ala Thr Ala Asp Val Glu Ile Gly Gly Thr Leu Ile Arg Ala Gly Glu
290 295 300

Gly Val Val Gly Leu Ser Asn Ala Gly Asn His Asp Pro Glu Gly Phe
305 310 315 320

Glu Asn Pro Asp Ala Phe Asp Ile Glu Arg Gly Ala Arg His His Val
325 330 335

Ala Phe Gly Phe Gly Val His Gln Cys Leu Gly Gln Asn Leu Ala Arg
340 345 350

Leu Glu Leu Gln Ile Val Phe Asp Thr Leu Phe Arg Arg Val Pro
355 360 365



<210> 47 
<211> 394 
<212> PRT 

<213> Amycolatopsis orientalis
<400> 47 

Leu Pro Leu Ala Arg Lys Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu
1 5 10 15

Arg Leu Arg Arg Glu Ser Pro Val Ser Arg Val Gly Leu Pro Ser Gly
20 25 30

Gln Thr Ala Trp Ala Leu Thr Arg Leu Glu Asp Ile Arg Glu Met Leu
35 40 45

Ser Ser Pro His Phe Ser Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu
50 55 60

Met Val Ala Arg Gln Ile Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser
65 70 75 80

Leu Ile Ala Met Asp Pro Pro Glu His Gly Lys Ala Arg Arg Asp Val
85 90 95

Val Gly Glu Phe Thr Val Lys Arg Met Lys Ala Leu Gln Pro Arg Ile
100 105 110

Gln Gln Ile Val Asp Glu His Ile Asp Ala Leu Leu Ala Gly Pro Lys
115 120 125

Pro Ala Asp Leu Val Gln Ala Leu Ser Leu Pro Val Pro Ser Leu Val
130 135 140

Ile Cys Glu Leu Leu Gly Val Pro Tyr Ser Asp His Glu Phe Phe Gln
145 150 155 160

Ser Cys Ser Ser Arg Met Leu Ser Arg Glu Val Thr Ala Glu Glu Arg
165 170 175

Met Thr Ala Phe Glu Ser Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr
180 185 190

Lys Lys Glu Ala Asn Ala Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile
195 200 205

Leu Lys Gln Arg Glu Ser Gly Glu Ala Asp His Gly Glu Leu Val Gly
210 215 220

Leu Ala Phe Leu Leu Leu Ile Ala Gly His Glu Thr Thr Ala Asn Met
225 230 235 240

Ile Ser Leu Gly Thr Val Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala



245 250 255 



Lys Ile Lys Ala Asp Pro Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu
260 265 270

Leu Arg Ile Phe Thr Ile Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr
275 280 285

Ala Asp Val Glu Ile Gly Gly Thr Leu Ile Arg Ala Gly Glu Gly Val
290 295 300

Val Gly Leu Ser Asn Ala Gly Asn His Asp Pro Asp Gly Phe Glu Asn
305 310 315 320

Pro Asp Thr Phe Asp Ile Glu Arg Gly Ala Arg His His Val Ala Phe
325 330 335

Gly Phe Gly Val His Gln Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu
340 345 350

Leu Gln Ile Val Phe Asp Thr Leu Phe Arg Arg Val Pro Gly Ile Arg
355 360 365

Ile Ala Val Pro Val Asp Glu Leu Pro Phe Lys His Asp Ser Thr Ile
370 375 380

Tyr Gly Leu Arg Ala Leu Pro Val Thr Trp
385 390


<210> 48
<211> 274
<212> PRT

<213> Amycolatopsis orientalis
<400> 48

Ser Gly Gln Thr Ala Trp Ala Leu Thr Arg Leu Glu Asp Ile Arg Glu
1 5 10 15

Met Leu Ser Ser Pro His Phe Ser Ser Asp Arg Gln Asn Pro Ser Phe
20 25 30

Pro Leu Met Val Ala Arg Gln Ile Arg Arg Glu Asp Lys Pro Phe Arg
35 40 45

Pro Ser Leu Ile Ala Met Asp Pro Pro Glu His Ser Lys Ala Arg Arg
50 55 60

Asp Val Val Gly Glu Phe Thr Val Lys Arg Met Lys Ala Leu Gln Pro
65 70 75 80

Arg Ile Gln Gln Ile Val Asp Glu His Ile Asp Ala Leu Leu Ala Gly
85 90 95

Pro Lys Pro Ala Asp Leu Val Gln Ala Leu Ser Leu Pro Val Pro Ser
100 105 110

Leu Val Ile Cys Glu Leu Leu Gly Val Pro Tyr Ser Asp His Glu Phe
115 120 125

Phe Gln Ser Cys Ser Ser Arg Met Leu Ser Arg Glu Val Thr Ala Glu
130 135 140

Glu Arg Met Thr Ala Phe Glu Ser Leu Glu Asn Tyr Leu Asp Glu Leu
145 150 155 160

Val Thr Lys Lys Glu Ala Asn Ala Thr Glu Asp Asp Leu Leu Gly Arg
165 170 175

Gln Ile Leu Lys Gln Arg Glu Thr Gly Glu Ala Asp His Gly Glu Leu
180 185 190

Val Gly Leu Ala Phe Leu Leu Leu Ile Ala Gly His Glu Thr Thr Ala
195 200 205

Asn Met Ile Ser Leu Gly Thr Ala Thr Leu Leu Glu Asn Pro Asp Gln
210 215 220

Leu Ala Lys Ile Lys Ala Asp Pro Gly Lys Thr Leu Ala Ala Ile Glu
225 230 235 240

Glu Leu Leu Arg Val Phe Thr Ile Ala Glu Thr Ala Thr Ser Arg Phe
245 250 255

Ala Thr Ala Asp Val Glu Ile Gly Gly Thr Leu Ile Arg Ala Gly Glu
260 265 270

Gly Val



<210> 49 

<211> 367 

<212> PRT 

<213> Amycolatopsis orientalis 

<400> 49 



Ala Thr Leu Pro Leu Ala Arg Lys Cys Pro Phe Ser Pro Pro Pro Glu
1 5 10 15

Tyr Glu Arg Leu Arg Arg Glu Ser Pro Val Ser Arg Val Gly Leu Pro
20 25 30

Ser Gly Gln Thr Ala Trp Ala Leu Thr Arg Leu Glu Asp Ile Arg Glu
35 40 45

Met Leu Ser Ser Pro His Phe Ser Ser Asp Arg Gln Ser Pro Ser Phe
50 55 60

Pro Leu Met Val Ala Arg Gln Ile Arg Arg Glu Asp Lys Pro Phe Arg
65 70 75 80

Pro Ser Leu Ile Ala Met Asp Pro Pro Glu His Gly Lys Ala Arg Arg
85 90 95

Asp Val Val Gly Glu Phe Thr Val Lys Arg Met Lys Ala Leu Gln Pro
100 105 110

Arg Ile Gln Gln Ile Val Asp Glu His Ile Asp Ala Leu Leu Ala Gly
115 120 125

Pro Lys Pro Ala Asp Leu Val Gln Ala Leu Ser Leu Pro Val Pro Ser
130 135 140

Leu Val Ile Cys Glu Leu Leu Gly Val Pro Tyr Ser Asp His Glu Phe
145 150 155 160

Phe Gln Ser Cys Ser Ser Arg Met Leu Ser Arg Glu Val Thr Ala Glu
165 170 175

Glu Arg Met Thr Ala Phe Glu Ser Leu Glu Asn Tyr Leu Asp Glu Leu
180 185 190

Val Thr Lys Lys Glu Ala Asn Ala Thr Glu Asp Asp Leu Leu Gly Arg
195 200 205

Gln Ile Leu Lys Gln Arg Glu Ser Gly Glu Ala Asp His Gly Glu Leu
210 215 220

Val Gly Leu Ala Phe Leu Leu Leu Ile Ala Gly His Glu Thr Thr Ala
225 230 235 240

Asn Met Ile Ser Leu Gly Thr Val Thr Leu Leu Glu Asn Pro Asp Gln
245 250 255

Leu Ala Lys Ile Lys Ala Asp Pro Gly Lys Thr Leu Ala Ala Ile Glu
260 265 270

Glu Leu Leu Arg Ile Phe Thr Ile Ala Glu Thr Ala Thr Ser Arg Phe
275 280 285

Ala Thr Ala Asp Val Glu Ile Gly Gly Thr Leu Ile Arg Ala Gly Glu
290 295 300

Gly Val Val Gly Leu Ser Asn Ala Gly Asn His Asp Pro Asp Gly Phe
305 310 315 320

Glu Asn Pro Asp Thr Phe Asp Ile Glu Arg Gly Ala Arg His His Val
325 330 335

Ala Phe Gly Phe Gly Val His Gln Cys Leu Gly Gln Asn Leu Ala Arg
340 345 350

Leu Glu Leu Gln Ile Val Phe Asp Thr Leu Phe Arg Arg Val Pro
355 360 365



<210> 50

<211> 25

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 50



aggaaaccac cgcgaccttg ccact 25 






<210> 51 

<211> 25 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 51 

accgaatccg aaggcgacgt gatgc 25 



<210> 52

<211> 23

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 52

cggaatgaat ccatccgcat acg 23



<210> 53

<211> 23

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 53

tgatcttcat ggctcctcct acc 23



<210> 54

<211> 35

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic
<220>

<221> misc_feature

<222> (18)..(20)

<223> n=a, c, g or t

<400> 54

gcgaagccga ccacggcnnn ctggtcggtc tggcg 35



<210> 55 
<211> 35 
<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic
<220>

<221> misc_feature

<222> (16)..(18)

<223> n=a, c, g or t

<400> 55

cgccagaccg accagnnngc cgtggtcggc ttcgc 35



<210> 56

<211> 35

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic
<220>

<221> misc_feature

<222> (14)..(14)

<223> n=a, c, g or t

<400> 56

ggtcggtctg gcgnysctcc tgctcatcgc 35



<210> 57

<211> 35

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic
<220>

<221> misc_feature

<222> (22)..(22)

<223> n=a, c, g or t

<400> 57

gccccgcgat gagcaggags rncgccagac cgacc 35

<210> 58 

<211> 35 

<212> DNA 

<213> Artificial sequence 
<220>

<223> Synthetic
<220>

<221> misc_feature

<222> (17)..(17)

<223> n=a, c, g or t



<400> 58 

ggtcggtctg gcgttcnysc tgctcatcgc ggggc 35 



<210> 59

<211> 35

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic
<220>

<221> misc_feature

<222> (19)..(19)

<223> n=a, c, g or t

<400> 59



gccccgcgat gagcagsrng aacgccagac cgacc 35 



<210> 60 

<211> 1215 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 60 



atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60
ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120
tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180
ccgcatttca gctccgacca gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240
cggcgcgagg acaagccgtt ccgcccgtcc ctcgtcgcga tggacccgcc ggaacacggc 300
aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360
cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420
gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480
gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540
gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600
gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660
cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcggc gctcctgctc 720
atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780
aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840
gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900
gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960
ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020
cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080
ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140
gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200
ccggtcacct ggtag 1215



<210> 61 

<211> 404 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 61 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Gln Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Val Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp



<210> 62 
<211> 1215 
<212> DNA

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 62 

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60 
ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120 
tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180 
ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240 
cggcgcgagg acaagccgtt ccgcccgtcc ctcgtcggga tggacccgcc ggaacacggc 300 
aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360 
cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420 
gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480 
gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540 
gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600 
gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660 
cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcggc gctcctgctc 720

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780 

aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840 

gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900 

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960 

ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020 

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080 

ttggaactcc agaccgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140 

gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200 

ccggtcacct ggtag 1215 



<210> 63 

<211> 404 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 63 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Val Gly Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Thr Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp



<210> 64 
<211> 1215 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 64 

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60 

ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120 

tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180 

ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240 

cggcgcgagg acaagccgtt ccgcccgtcc ctcgtcgcga tggacccgcc ggaacacggc 300 

aaggccaggc gtgacgccgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360 

cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420 

gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480 

gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540 

gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600 

gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660 

cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcggc gctcctgctc 720 

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780 

aaccccgatc agctggcgaa gatcaaggca gatccgggca agaccctcgc cgcgatcgag 840 
gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960

ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080

ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140

gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200

ccggtcacct ggtag 1215



<210> 65
<211> 404
<212> PRT

<213> Artificial sequence
<220>

<223> Synthetic
<400> 65



Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Val Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Ala Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp

<210> 66 

<211> 1215 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 66 



atgaccgacg tcgaggaaac caccgcgacc ttgccactgg ctcgcaaatg cccgttttca 60

ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120

tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180

ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240

cggcgcgagg acaagccgtt ccacccgtcc ctcgtcgcga tggacccgcc ggaacacggc 300

aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360

cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420

gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480

gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540

gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600

gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660

cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcggc gctcctgctc 720

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780

aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840

gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960

ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020 

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080 

ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140 

gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200 

ccggtcacct ggtag 1215 



<210> 67
<211> 404
<212> PRT

<213> Artificial sequence
<220>

<223> Synthetic
<400> 67



Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe His Pro Ser Leu Val Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140






WO 2004/061116 



PCT/US2003/034082 



Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp



<210> 68 

<211> 1215 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic

<400> 68 



atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60

ccaccgcccg aatacgagcg gctccgccgg aaaagtccgg tttcccgggt cggtctcccc 120

tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180

ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240

cggcgcgagg acaagccgtt ccgcccgtcc ctcatcgcga tggacccgcc ggaacacggc 300

aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360

cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420

gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480

gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540

gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600

gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660

cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcgtt cctcctgctc 720

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780

aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840

gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960

ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080

ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140

gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200

ccggtcacct ggtag 1215



<210> 69 
<211> 404 
<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 69 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Lys Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Ile Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Phe Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp



<210> 70
<211> 35
<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic
<220>

<221> misc_feature
<222> (20)..(21)
<223> n=a, c, g or t



<400> 70 

gttccgcccg tccctcgtcn nsatggaccc gccgg 35 



<210> 71
<211> 35
<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic
<220>

<221> misc_feature
<222> (15)..(16)
<223> n=a, c, g or t
<400> 71



cctgcagttc ccggnnsctc agccgggaag tcacc 35 

<210> 72 
<211> 1215 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 72 

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60 
ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120 
tccggtcaga ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180 
ccgcatttca gctccgacca gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240

cggcgcgagg acaagccgtt ccgcccgtcc ctcgtcgcga tggacccgcc ggaacacggc 300 

aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360 

cgtattcagc agatcgtcga cgagcatacc gacgccctgc tcgccggccc caaacccgcc 420 

gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480 

gtcccctatt cggaccacga gttcttccag tcctgcagtt cccgggcgct cagccgggaa 540 

gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600 

gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660 

cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcggc gctcctgctc 720 

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780 

aaccccgatc agctggcgaa gatcaaggcg gacccgggca agaccctcgc cgcgatcgag 840 

gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960 

ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020 

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080 

ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140 

gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200 

ccggtcacct ggtag 1215 

<210> 73 
<211> 404 
<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 73 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Gln Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Val Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Thr Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Ala
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp

<210> 74 
<211> 1215 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 74 

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60 
ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120 
tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180 
ccgcatttca gctccgacca gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240 
cggcgcgagg acaagccgtt ccgcccgtcc ctcgtcgcga tggacccgcc ggaacacggc 300 
aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaggc gcttcagcca 360 
cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccacc 420
gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480
gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggtcgct cagccgggaa 540
gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600
gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660
cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcggc gctcctgctc 720
atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780
aaccccgatc agctggcgaa gatcaaggcg gacccgggca agaccctcgc cgcgatcgag 840
gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900
gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960
ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020
cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080
ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140
gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200
ccggtcacct ggtag 1215



<210> 75
<211> 404
<212> PRT

<213> Artificial sequence
<220> 

<223> Synthetic 
<400> 75 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Gln Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Val Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Thr Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Ser
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp



<210> 76 
<211> 404 
<212> PRT 

<213> Saccharopolyspora erythraea
<400> 76 

Met Thr Thr Val Pro Asp Leu Glu Ser Asp Ser Phe His Val Asp Trp
1 5 10 15

Tyr Arg Thr Tyr Ala Glu Leu Arg Glu Thr Ala Pro Val Thr Pro Val
20 25 30

Arg Phe Leu Gly Gln Asp Ala Trp Leu Val Thr Gly Tyr Asp Glu Ala
35 40 45

Lys Ala Ala Leu Ser Asp Leu Arg Leu Ser Ser Asp Pro Lys Lys Lys
50 55 60

Tyr Pro Gly Val Glu Val Glu Phe Pro Ala Tyr Leu Gly Phe Pro Glu
65 70 75 80

Asp Val Arg Asn Tyr Phe Ala Thr Asn Met Gly Thr Ser Asp Pro Pro
85 90 95

Thr His Thr Arg Leu Arg Lys Leu Val Ser Gln Glu Phe Thr Val Arg
100 105 110

Arg Val Glu Ala Met Arg Pro Arg Val Glu Gln Ile Thr Ala Glu Leu
115 120 125

Leu Asp Glu Val Gly Asp Ser Gly Val Val Asp Ile Val Asp Arg Phe
130 135 140

Ala His Pro Leu Pro Ile Lys Val Ile Cys Glu Leu Leu Gly Val Asp
145 150 155 160

Glu Lys Tyr Arg Gly Glu Phe Gly Arg Trp Ser Ser Glu Ile Leu Val
165 170 175

Met Asp Pro Glu Arg Ala Glu Gln Arg Gly Gln Ala Ala Arg Glu Val
180 185 190

Val Asn Phe Ile Leu Asp Leu Val Glu Arg Arg Arg Thr Glu Pro Gly
195 200 205

Asp Asp Leu Leu Ser Ala Leu Ile Arg Val Gln Asp Asp Asp Asp Gly
210 215 220

Arg Leu Ser Ala Asp Glu Leu Thr Ser Ile Ala Leu Val Leu Leu Leu
225 230 235 240

Ala Gly Phe Glu Ala Ser Val Ser Leu Ile Gly Ile Gly Thr Tyr Leu
245 250 255

Leu Leu Thr His Pro Asp Gln Leu Ala Leu Val Arg Arg Asp Pro Ser
260 265 270

Ala Leu Pro Asn Ala Val Glu Glu Ile Leu Arg Tyr Ile Ala Pro Pro
275 280 285

Glu Thr Thr Thr Arg Phe Ala Ala Glu Glu Val Glu Ile Gly Gly Val
290 295 300

Ala Ile Pro Gln Tyr Ser Thr Val Leu Val Ala Asn Gly Ala Ala Asn
305 310 315 320

Arg Asp Pro Lys Gln Phe Pro Asp Pro His Arg Phe Asp Val Thr Arg
325 330 335

Asp Thr Arg Gly His Leu Ser Phe Gly Gln Gly Ile His Phe Cys Met
340 345 350

Gly Arg Pro Leu Ala Lys Leu Glu Gly Glu Val Ala Leu Arg Ala Leu
355 360 365

Phe Gly Arg Phe Pro Ala Leu Ser Leu Gly Ile Asp Ala Asp Asp Val
370 375 380

Val Trp Arg Arg Ser Leu Leu Leu Arg Gly Ile Asp His Leu Pro Val
385 390 395 400

Arg Leu Asp Gly